PREFACE


Excerpts from the Preface to the First Edition 


There seems to be no general agreement as to what should constitute a first course in 
calculus and analytic geometry. Some people insist that the only way to really understand 
calculus is to start off with a thorough treatment of the real-number system and develop 
the subject step by step in a logical and rigorous fashion. Others argue that calculus is 
primarily a tool for engineers and physicists; they believe the course should stress
applications of the calculus by appeal to intuition and by extensive drill on problems which develop
manipulative skills. There is much that is sound in both these points of view. Calculus is 
a deductive science and a branch of pure mathematics. At the same time, it is very impor- 
tant to remember that calculus has strong roots in physical problems and that it derives 
much of its power and beauty from the variety of its applications. It is possible to combine 
a strong theoretical development with sound training in technique; this book represents 
an attempt to strike a sensible balance between the two. While treating the calculus as a 
deductive science, the book does not neglect applications to physical problems. Proofs of 
all the important theorems are presented as an essential part of the growth of mathematical 
ideas; the proofs are often preceded by a geometric or intuitive discussion to give the 
student some insight into why they take a particular form. Although these intuitive dis- 
cussions will satisfy readers who are not interested in detailed proofs, the complete proofs 
are also included for those who prefer a more rigorous presentation. 

The approach in this book has been suggested by the historical and philosophical
development of calculus and analytic geometry. For example, integration is treated before
differentiation. Although to some this may seem unusual, it is historically correct and 
pedagogically sound. Moreover, it is the best way to make meaningful the true connection 
between the integral and the derivative. 

The concept of the integral is defined first for step functions. Since the integral of a step 
function is merely a finite sum, integration theory in this case is extremely simple. As the 
student learns the properties of the integral for step functions, he gains experience in the 
use of the summation notation and at the same time becomes familiar with the notation 
for integrals. This sets the stage so that the transition from step functions to more general 
functions seems easy and natural. 


Preface to the Second Edition 

The second edition differs from the first in many respects. Linear algebra has been 
incorporated, the mean-value theorems and routine applications of calculus are introduced 
at an earlier stage, and many new and easier exercises have been added. A glance at the 
table of contents reveals that the book has been divided into smaller chapters, each centering 
on an important concept. Several sections have been rewritten and reorganized to provide 
better motivation and to improve the flow of ideas. 

As in the first edition, a historical introduction precedes each important new concept, 
tracing its development from an early intuitive physical notion to its precise mathematical 
formulation. The student is told something of the struggles of the past and of the triumphs 
of the men who contributed most to the subject. Thus the student becomes an active 
participant in the evolution of ideas rather than a passive observer of results. 

The second edition, like the first, is divided into two volumes. The first two thirds of 
Volume I deals with the calculus of functions of one variable, including infinite series and
an introduction to differential equations. The last third of Volume I introduces linear
algebra with applications to geometry and analysis. Much of this material leans heavily 
on the calculus for examples that illustrate the general theory. It provides a natural 
blending of algebra and analysis and helps pave the way for the transition from one- 
variable calculus to multivariable calculus, discussed in Volume II. Further development 
of linear algebra will occur as needed in the second edition of Volume II. 

Once again I acknowledge with pleasure my debt to Professors H. F. Bohnenblust,
A. Erdelyi, F. B. Fuller, K. Hoffman, G. Springer, and H. S. Zuckerman. Their influence
on the first edition continued into the second. In preparing the second edition, I received
additional help from Professor Basil Gordon, who suggested many improvements. Thanks
are also due George Springer and William P. Ziemer, who read the final draft. The staff
of the Blaisdell Publishing Company has, as always, been helpful; I appreciate their
sympathetic consideration of my wishes concerning format and typography.

Finally, it gives me special pleasure to express my gratitude to my wife for the many ways 
she has contributed during the preparation of both editions. In grateful acknowledgment 
I happily dedicate this book to her.

T. M. A. 

Pasadena, California 
September 16, 1966 
Calculus




INTRODUCTION 


Part 1. Historical Introduction


I 1.1 The two basic concepts of calculus

The remarkable progress that has been made in science and technology during the last 
century is due in large part to the development of mathematics. That branch of mathematics
known as integral and differential calculus serves as a natural and powerful tool for attacking 
a variety of problems that arise in physics, astronomy, engineering, chemistry, geology, 
biology, and other fields including, rather recently, some of the social sciences. 

To give the reader an idea of the many different types of problems that can be treated by 
the methods of calculus, we list here a few sample questions selected from the exercises that 
occur in later chapters of this book. 

With what speed should a rocket be fired upward so that it never returns to earth? What 
is the radius of the smallest circular disk that can cover every isosceles triangle of a given 
perimeter L? What volume of material is removed from a solid sphere of radius 2r if a hole 
of radius r is drilled through the center? If a strain of bacteria grows at a rate proportional
to the amount present and if the population doubles in one hour, by how much will it
increase at the end of two hours? If a ten-pound force stretches an elastic spring one inch,
how much work is required to stretch the spring one foot?

These examples, chosen from various fields, illustrate some of the technical questions that 
can be answered by more or less routine applications of calculus. 

Calculus is more than a technical tool; it is a collection of fascinating and exciting ideas
that have interested thinking men for centuries. These ideas have to do with speed, area,
volume, rate of growth, continuity, tangent line, and other concepts from a variety of fields.
Calculus forces us to stop and think carefully about the meanings of these concepts. Another
remarkable feature of the subject is its unifying power. Most of these ideas can be formulated
so that they revolve around two rather specialized problems of a geometric nature. We
turn now to a brief description of these problems.

Consider a curve C which lies above a horizontal base line such as that shown in Figure 
1.1. We assume this curve has the property that every vertical line intersects it once at most. 
The shaded portion of the figure consists of those points which lie below the curve C, above
the horizontal base, and between two parallel vertical segments joining C to the base. The
first fundamental problem of calculus is this: To assign a number which measures the area
of this shaded region. 

Consider next a line drawn tangent to the curve, as shown in Figure 1.1. The second 
fundamental problem may be stated as follows: To assign a number which measures the 
steepness of this line. 



Basically, calculus has to do with the precise formulation and solution of these two 
special problems. It enables us to define the concepts of area and tangent line and to cal- 
culate the area of a given region or the steepness of a given tangent line. Integral calculus 
deals with the problem of area and will be discussed in Chapter 1. Differential calculus deals 
with the problem of tangents and will be introduced in Chapter 4. 

The study of calculus requires a certain mathematical background. The present chapter 
deals with this background material and is divided into four parts: Part 1 provides historical
perspective; Part 2 discusses some notation and terminology from the mathematics of sets; 
Part 3 deals with the real-number system; Part 4 treats mathematical induction and the 
summation notation. If the reader is acquainted with these topics, he can proceed directly 
to the development of integral calculus in Chapter 1. If not, he should become familiar 
with the material in the unstarred sections of this Introduction before proceeding to 
Chapter 1. 

I 1.2 Historical background

The birth of integral calculus occurred more than 2000 years ago when the Greeks 
attempted to determine areas by a process which they called the method of exhaustion. The
essential ideas of this method are very simple and can be described briefly as follows: Given 
a region whose area is to be determined, we inscribe in it a polygonal region which approxi- 
mates the given region and whose area we can easily compute. Then we choose another 
polygonal region which gives a better approximation, and we continue the process, taking 
polygons with more and more sides in an attempt to exhaust the given region. The method 
is illustrated for a semicircular region in Figure 1.2. It was used successfully by Archimedes 
(287-212 b.c.) to find exact formulas for the area of a circle and a few other special figures. 
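The process just described can be imitated numerically. The following is only a sketch in modern terms, not Archimedes' own procedure: it assumes regular polygons inscribed in a full circle, and it uses the library sine function (which already presupposes the number pi) purely as a shortcut for the area of each triangular piece. The function names `inscribed_polygon_area` and `exhaustion_areas` are hypothetical and chosen for illustration.

```python
import math


def inscribed_polygon_area(radius, sides):
    """Area of a regular polygon with `sides` vertices on a circle of the given radius."""
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    # The polygon splits into `sides` congruent isosceles triangles, each with
    # two sides of length `radius` enclosing the central angle 2*pi/sides.
    return 0.5 * sides * radius ** 2 * math.sin(2 * math.pi / sides)


def exhaustion_areas(radius, start_sides=6, doublings=6):
    """Areas of inscribed polygons as the number of sides is repeatedly doubled."""
    areas = []
    sides = start_sides
    for _ in range(doublings + 1):
        areas.append((sides, inscribed_polygon_area(radius, sides)))
        sides *= 2
    return areas


if __name__ == "__main__":
    for sides, area in exhaustion_areas(1.0):
        print(f"{sides:4d} sides: area = {area:.6f}")
    print(f"circle area for comparison: {math.pi:.6f}")
```

Each doubling of the number of sides gives a larger polygonal area that still falls short of the area of the circle, which is the sense in which the polygons "exhaust" the region.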




The development of the method of exhaustion beyond the point to which Archimedes
carried it had to wait nearly eighteen centuries until the use of algebraic symbols and 
techniques became a standard part of mathematics. The elementary algebra that is familiar 
to most high-school students today was completely unknown in Archimedes’ time, and it 
would have been next to impossible to extend his method to any general class of regions 
without some convenient way of expressing rather lengthy calculations in a compact and 
simplified form. 

A slow but revolutionary change in the development of mathematical notations began
in the 16th Century A.D. The cumbersome system of Roman numerals was gradually
displaced by the Hindu-Arabic characters used today, the symbols + and − were introduced
for the first time, and the advantages of the decimal notation began to be recognized.
During this same period, the brilliant successes of the Italian mathematicians Tartaglia,
Cardano, and Ferrari in finding algebraic solutions of cubic and quartic equations
stimulated a great deal of activity in mathematics and encouraged the growth and acceptance of a
new and superior algebraic language. With the widespread introduction of well-chosen
algebraic symbols, interest was revived in the ancient method of exhaustion and a large
number of fragmentary results were discovered in the 17th Century by such pioneers as
Cavalieri, Torricelli, Roberval, Fermat, Pascal, and Wallis.

Figure 1.2 The method of exhaustion applied to a semicircular region.

Gradually the method of exhaustion was transformed into the subject now called integral 
calculus, a new and powerful discipline with a large variety of applications, not only to 
geometrical problems concerned with areas and volumes but also to problems in other 
sciences. This branch of mathematics, which retained some of the original features of the 
method of exhaustion, received its biggest impetus in the 17th Century, largely due to the 
efforts of Isaac Newton (1642-1727) and Gottfried Leibniz (1646-1716), and its develop- 
ment continued well into the 19th Century before the subject was put on a firm mathematical 
basis by such men as Augustin-Louis Cauchy (1789-1857) and Bernhard Riemann (1826- 
1866). Further refinements and extensions of the theory are still being carried out in
contemporary mathematics. 


I 1.3 The method of exhaustion for the area of a parabolic segment

Before we proceed to a systematic treatment of integral calculus, it will be instructive 
to apply the method of exhaustion directly to one of the special figures treated by Archi- 
medes himself. The region in question is shown in Figure 1.3 and can be described as 
follows: If we choose an arbitrary point on the base of this figure and denote its distance 
from 0 by x, then the vertical distance from this point to the curve is $x^2$. In particular, if
the length of the base itself is b, the altitude of the figure is $b^2$. The vertical distance from
x to the curve is called the “ordinate” at x. The curve itself is an example of what is known
as a parabola. The region bounded by it and the two line segments is called a parabolic
segment.

Figure 1.3 A parabolic segment.

Figure 1.4

This figure may be enclosed in a rectangle of base b and altitude $b^2$, as shown in Figure 1.3.
Examination of the figure suggests that the area of the parabolic segment is less than half
the area of the rectangle. Archimedes made the surprising discovery that the area of the
parabolic segment is exactly one-third that of the rectangle; that is to say, $A = b^3/3$, where
A denotes the area of the parabolic segment. We shall show presently how to arrive at this 
result. 

It should be pointed out that the parabolic segment in Figure 1.3 is not shown exactly as
Archimedes drew it and the details that follow are not exactly the same as those used by him.
Nevertheless, the essential ideas are those of Archimedes; what is presented here is the
method of exhaustion in modern notation.

Figure 1.5 Calculation of the area of a parabolic segment.

The method is simply this: We slice the figure into a number of strips and obtain two 
approximations to the region, one from below and one from above, by using two sets of 
rectangles as illustrated in Figure 1.4. (We use rectangles rather than arbitrary polygons to 
simplify the computations.) The area of the parabolic segment is larger than the total area 
of the inner rectangles but smaller than that of the outer rectangles. 

If each strip is further subdivided to obtain a new approximation with a larger number 
of strips, the total area of the inner rectangles increases, whereas the total area of the outer 
rectangles decreases. Archimedes realized that an approximation to the area within any 
desired degree of accuracy could be obtained by simply taking enough strips. 

Let us carry out the actual computations that are required in this case. For the sake of
simplicity, we subdivide the base into n equal parts, each of length $b/n$ (see Figure 1.5). The
points of subdivision correspond to the following values of x:

$$0, \quad \frac{b}{n}, \quad \frac{2b}{n}, \quad \frac{3b}{n}, \quad \ldots, \quad \frac{(n - 1)b}{n}, \quad \frac{nb}{n} = b.$$

A typical point of subdivision corresponds to $x = kb/n$, where k takes the successive values
k = 0, 1, 2, 3, . . . , n. At each point $kb/n$ we construct the outer rectangle of altitude $(kb/n)^2$
as illustrated in Figure 1.5. The area of this rectangle is the product of its base and altitude
and is equal to

$$\frac{b}{n} \left( \frac{kb}{n} \right)^2 = \frac{b^3}{n^3} k^2.$$

Let us denote by $S_n$ the sum of the areas of all the outer rectangles. Then since the kth
rectangle has area $(b^3/n^3)k^2$, we obtain the formula

(1.1)    $S_n = \frac{b^3}{n^3} (1^2 + 2^2 + 3^2 + \cdots + n^2)$.


In the same way we obtain a formula for the sum $s_n$ of all the inner rectangles:

(1.2)    $s_n = \frac{b^3}{n^3} [1^2 + 2^2 + 3^2 + \cdots + (n - 1)^2]$.
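The two sums in (1.1) and (1.2) are easy to compute by direct addition for small n. The sketch below is an illustration only, with hypothetical function names `outer_sum` and `inner_sum`; it adds up the rectangle areas one by one in exact rational arithmetic, so that no rounding enters the comparison.

```python
from fractions import Fraction


def outer_sum(b, n):
    """S_n of (1.1): total area of the n outer rectangles under the ordinate x^2."""
    width = Fraction(b) / n
    # The k-th outer rectangle has base b/n and altitude (kb/n)^2, for k = 1, ..., n.
    return sum(width * (k * width) ** 2 for k in range(1, n + 1))


def inner_sum(b, n):
    """s_n of (1.2): total area of the inner rectangles, k = 1, ..., n - 1."""
    width = Fraction(b) / n
    return sum(width * (k * width) ** 2 for k in range(1, n))


if __name__ == "__main__":
    b = 1
    for n in (4, 10, 100, 1000):
        s, S = inner_sum(b, n), outer_sum(b, n)
        print(f"n = {n:5d}   s_n = {float(s):.6f}   S_n = {float(S):.6f}")
```

For b = 1 and n = 4 the sums are 7/32 and 15/32. The two sums differ only in the term with k = n, so their difference is always $b^3/n$, which shrinks as the number of strips grows.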

This brings us to a very important stage in the calculation. Notice that the factor multiplying
$b^3/n^3$ in Equation (1.1) is the sum of the squares of the first n integers:

$$1^2 + 2^2 + \cdots + n^2.$$


[The corresponding factor in Equation (1.2) is similar except that the sum has only n - 1
terms.] For a large value of n, the computation of this sum by direct addition of its terms is
tedious and inconvenient. Fortunately there is an interesting identity which makes it possible
to evaluate this sum in a simpler way, namely,

(1.3)    $1^2 + 2^2 + \cdots + n^2 = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}$.

This identity is valid for every integer $n \ge 1$ and can be proved as follows: Start with the
formula $(k + 1)^3 = k^3 + 3k^2 + 3k + 1$ and rewrite it in the form

$$3k^2 + 3k + 1 = (k + 1)^3 - k^3.$$

Taking k = 1, 2, . . . , n - 1, we get n - 1 formulas

$$3 \cdot 1^2 + 3 \cdot 1 + 1 = 2^3 - 1^3$$
$$3 \cdot 2^2 + 3 \cdot 2 + 1 = 3^3 - 2^3$$
$$\vdots$$
$$3(n - 1)^2 + 3(n - 1) + 1 = n^3 - (n - 1)^3.$$

When we add these formulas, all the terms on the right cancel except two and we obtain

$$3[1^2 + 2^2 + \cdots + (n - 1)^2] + 3[1 + 2 + \cdots + (n - 1)] + (n - 1) = n^3 - 1^3.$$

The second sum on the left is the sum of terms in an arithmetic progression and it simplifies
to $\frac{1}{2} n(n - 1)$. Therefore this last equation gives us

(1.4)    $1^2 + 2^2 + \cdots + (n - 1)^2 = \frac{n^3}{3} - \frac{n^2}{2} + \frac{n}{6}$.

Adding $n^2$ to both members, we obtain (1.3).
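A numerical check is no substitute for the proof, but it is a quick way to catch a slip in the algebra. The sketch below, with hypothetical function names, compares direct addition against the closed form in (1.3) using exact fractions, and also confirms the telescoping step: the sum of $3k^2 + 3k + 1$ for k = 1, . . . , n - 1 should equal $n^3 - 1$.

```python
from fractions import Fraction


def sum_of_squares(n):
    """Left side of (1.3): 1^2 + 2^2 + ... + n^2 by direct addition."""
    return sum(k * k for k in range(1, n + 1))


def closed_form(n):
    """Right side of (1.3): n^3/3 + n^2/2 + n/6, in exact arithmetic."""
    n = Fraction(n)
    return n ** 3 / 3 + n ** 2 / 2 + n / 6


def telescoped_total(n):
    """Add 3k^2 + 3k + 1 for k = 1, ..., n - 1, as in the proof."""
    return sum(3 * k * k + 3 * k + 1 for k in range(1, n))


if __name__ == "__main__":
    for n in range(1, 11):
        assert sum_of_squares(n) == closed_form(n)   # identity (1.3)
        assert telescoped_total(n) == n ** 3 - 1     # the right sides telescope
    print("checked n = 1, ..., 10")
```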

For our purposes, we do not need the exact expressions given in the right-hand members 
of (1.3) and (1.4). All we need are the two inequalities 


(1.5)    $1^2 + 2^2 + \cdots + (n - 1)^2 < \frac{n^3}{3} < 1^2 + 2^2 + \cdots + n^2$

which are valid for every integer $n \ge 1$. These inequalities can be deduced easily as
consequences of (1.3) and (1.4), or they can be proved directly by induction. (A proof by
induction is given in Section I 4.1.)

If we multiply both inequalities in (1.5) by $b^3/n^3$ and make use of (1.1) and (1.2), we obtain

(1.6)    $s_n < \frac{b^3}{3} < S_n$

for every n. The inequalities in (1.6) tell us that $b^3/3$ is a number which lies between $s_n$ and
$S_n$ for every n. We will now prove that $b^3/3$ is the only number which has this property. In
other words, we assert that if A is any number which satisfies the inequalities

(1.7)    $s_n < A < S_n$

for every positive integer n, then $A = b^3/3$. It is because of this fact that Archimedes
concluded that the area of the parabolic segment is $b^3/3$.

To prove that $A = b^3/3$, we use the inequalities in (1.5) once more. Adding $n^2$ to both
sides of the leftmost inequality in (1.5), we obtain

$$1^2 + 2^2 + \cdots + n^2 < \frac{n^3}{3} + n^2.$$

Multiplying this by $b^3/n^3$ and using (1.1), we find

(1.8)    $S_n < \frac{b^3}{3} + \frac{b^3}{n}$.

Similarly, by subtracting $n^2$ from both sides of the rightmost inequality in (1.5) and
multiplying by $b^3/n^3$, we are led to the inequality

(1.9)    $\frac{b^3}{3} - \frac{b^3}{n} < s_n$.


Therefore, any number A satisfying (1.7) must also satisfy

(1.10)    $\frac{b^3}{3} - \frac{b^3}{n} < A < \frac{b^3}{3} + \frac{b^3}{n}$

for every integer $n \ge 1$. Now there are only three possibilities:

$$A > \frac{b^3}{3}, \qquad A < \frac{b^3}{3}, \qquad A = \frac{b^3}{3}.$$

If we show that each of the first two leads to a contradiction, then we must have $A = b^3/3$,
since, in the manner of Sherlock Holmes, this exhausts all the possibilities.

Suppose the inequality $A > b^3/3$ were true. From the second inequality in (1.10) we
obtain

(1.11)    $A - \frac{b^3}{3} < \frac{b^3}{n}$

for every integer $n \ge 1$. Since $A - b^3/3$ is positive, we may divide both sides of (1.11) by
$A - b^3/3$ and then multiply by n to obtain the equivalent statement

$$n < \frac{b^3}{A - b^3/3}$$

for every n. But this inequality is obviously false when $n \ge b^3/(A - b^3/3)$. Hence the
inequality $A > b^3/3$ leads to a contradiction. By a similar argument, we can show that the
inequality $A < b^3/3$ also leads to a contradiction, and therefore we must have $A = b^3/3$,
as asserted.
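The contradiction argument is constructive: for any candidate A different from $b^3/3$ it names an integer n at which (1.10) must fail. The sketch below makes this explicit under the assumption that b is positive and that A and b are given as exact rationals; the function names `satisfies_1_10` and `refuting_n` are hypothetical, and the code treats both cases $A > b^3/3$ and $A < b^3/3$ at once by working with the absolute difference.

```python
from fractions import Fraction
import math


def satisfies_1_10(A, b, n):
    """True when b^3/3 - b^3/n < A < b^3/3 + b^3/n, that is, (1.10) holds for this n."""
    A, b = Fraction(A), Fraction(b)
    center, slack = b ** 3 / 3, b ** 3 / n
    return center - slack < A < center + slack


def refuting_n(A, b):
    """Smallest integer n at which a candidate A != b^3/3 violates (1.10), or None."""
    A, b = Fraction(A), Fraction(b)
    if b <= 0:
        raise ValueError("the base length b must be positive")
    gap = abs(A - b ** 3 / 3)
    if gap == 0:
        return None  # A = b^3/3 satisfies (1.10) for every n
    # (1.11) rearranges to n < b^3 / gap, so every integer n >= b^3 / gap fails.
    return math.ceil(b ** 3 / gap)


if __name__ == "__main__":
    candidate = Fraction(34, 100)          # slightly more than 1/3
    n = refuting_n(candidate, 1)
    print(n, satisfies_1_10(candidate, 1, n - 1), satisfies_1_10(candidate, 1, n))
```

With b = 1 and the candidate A = 34/100, the difference from 1/3 is 1/150, so (1.10) still holds at n = 149 but fails at n = 150. However close a wrong candidate is to $b^3/3$, some finite n rules it out.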

*I 1.4 Exercises

1. (a) Modify the region in Figure 1.3 by assuming that the ordinate at each x is $2x^2$ instead of
$x^2$. Draw the new figure. Check through the principal steps in the foregoing section and
find what effect this has on the calculation of the area. Do the same if the ordinate at each x is
(b) $3x^2$, (c) $\frac{1}{4}x^2$, (d) $2x^2 + 1$, (e) $ax^2 + c$.

2. Modify the region in Figure 1.3 by assuming that the ordinate at each x is $x^3$ instead of $x^2$.
Draw the new figure.

(a) Use a construction similar to that illustrated in Figure 1.5 and show that the outer and inner
sums $S_n$ and $s_n$ are given by

$$S_n = \frac{b^4}{n^4} (1^3 + 2^3 + \cdots + n^3), \qquad s_n = \frac{b^4}{n^4} [1^3 + 2^3 + \cdots + (n - 1)^3].$$

(b) Use the inequalities (which can be proved by mathematical induction; see Section I 4.2)

(1.12)    $1^3 + 2^3 + \cdots + (n - 1)^3 < \frac{n^4}{4} < 1^3 + 2^3 + \cdots + n^3$

to show that $s_n < b^4/4 < S_n$ for every n, and prove that $b^4/4$ is the only number which lies
between $s_n$ and $S_n$ for every n.

(c) What number takes the place of $b^4/4$ if the ordinate at each x is $ax^3 + c$?

3. The inequalities (1.5) and (1.12) are special cases of the more general inequalities

(1.13)    $1^k + 2^k + \cdots + (n - 1)^k < \frac{n^{k+1}}{k + 1} < 1^k + 2^k + \cdots + n^k$

that are valid for every integer $n \ge 1$ and every integer $k \ge 1$. Assume the validity of (1.13)
and generalize the results of Exercise 2.


I 1.5 A critical analysis of Archimedes’ method

From calculations similar to those in Section I 1.3, Archimedes concluded that the area
of the parabolic segment in question is $b^3/3$. This fact was generally accepted as a
mathematical theorem for nearly 2000 years before it was realized that one must re-examine
the result from a more critical point of view. To understand why anyone would question 
the validity of Archimedes’ conclusion, it is necessary to know something about the important 
changes that have taken place in the recent history of mathematics. 

Every branch of knowledge is a collection of ideas described by means of words and 
symbols, and one cannot understand these ideas unless one knows the exact meanings of
the words and symbols that are used. Certain branches of knowledge, known as deductive 
systems, are different from others in that a number of “undefined” concepts are chosen 
in advance and all other concepts in the system are defined in terms of these. Certain 
statements about these undefined concepts are taken as axioms or postulates and other 
statements that can be deduced from the axioms are called theorems. The most familiar
example of a deductive system is the Euclidean theory of elementary geometry that has 
been studied by well-educated men since the time of the ancient Greeks. 

The spirit of early Greek mathematics, with its emphasis on the theoretical and postu- 
lational approach to geometry as presented in Euclid’s Elements, dominated the thinking 
of mathematicians until the time of the Renaissance. A new and vigorous phase in the 
development of mathematics began with the advent of algebra in the 16th Century, and 
the next 300 years witnessed a flood of important discoveries. Conspicuously absent from 
this period was the logically precise reasoning of the deductive method with its use of 
axioms, definitions, and theorems. Instead, the pioneers in the 16th, 17th, and 18th cen- 
turies resorted to a curious blend of deductive reasoning combined with intuition, pure 
guesswork, and mysticism, and it is not surprising to find that some of their work was 
later shown to be incorrect. However, a surprisingly large number of important discoveries 
emerged from this era, and a great deal of the work has survived the test of history, a
tribute to the unusual skill and ingenuity of these pioneers.

As the flood of new discoveries began to recede, a new and more critical period emerged. 
Little by little, mathematicians felt forced to return to the classical ideals of the deductive 
method in an attempt to put the new mathematics on a firm foundation. This phase of the 
development, which began early in the 19th Century and has continued to the present day, 
has resulted in a degree of logical purity and abstraction that has surpassed all the traditions 
of Greek science. At the same time, it has brought about a clearer understanding of the 
foundations of not only calculus but of all of mathematics. 

There are many ways to develop calculus as a deductive system. One possible approach 
is to take the real numbers as the undefined objects. Some of the rules governing the 
operations on real numbers may then be taken as axioms. One such set of axioms is listed 
in Part 3 of this Introduction. New concepts, such as integral, limit, continuity, derivative, 
must then be defined in terms of real numbers. Properties of these concepts are then 
deduced as theorems that follow from the axioms. 

Looked at as part of the deductive system of calculus, Archimedes’ result about the area 
of a parabolic segment cannot be accepted as a theorem until a satisfactory definition of 
area is given first. It is not clear whether Archimedes had ever formulated a precise defini- 
tion of what he meant by area. He seems to have taken it for granted that every region has an 
area associated with it. On this assumption he then set out to calculate areas of particular 
regions. In his calculations he made use of certain facts about area that cannot be proved 
until we know what is meant by area. For instance, he assumed that if one region lies inside 
another, the area of the smaller region cannot exceed that of the larger region. Also, if a 
region is decomposed into two or more parts, the sum of the areas of the individual parts is 
equal to the area of the whole region. All these are properties we would like area to possess, 
and we shall insist that any definition of area should imply these properties. It is quite 
possible that Archimedes himself may have taken area to be an undefined concept and then 
used the properties we just mentioned as axioms about area. 

Today we consider the work of Archimedes as being important not so much because it 
helps us to compute areas of particular figures, but rather because it suggests a reasonable 
way to define the concept of area for more or less arbitrary figures. As it turns out, the 
method of Archimedes suggests a way to define a much more general concept known as the 
integral. The integral, in turn, is used to compute not only area but also quantities such as 
arc length, volume, work and others.

If we look ahead and make use of the terminology of integral calculus, the result of the
calculation carried out in Section I 1.3 for the parabolic segment is often stated as follows:

“The integral of $x^2$ from 0 to b is $b^3/3$.”

It is written symbolically as

$$\int_0^b x^2 \, dx = \frac{b^3}{3}.$$

The symbol $\int$ (an elongated S) is called an integral sign, and it was introduced by Leibniz
in 1675. The process which produces the number $b^3/3$ is called integration. The numbers
0 and b which are attached to the integral sign are referred to as the limits of integration.
The symbol $\int_0^b x^2 \, dx$ must be regarded as a whole. Its definition will treat it as such, just
as the dictionary describes the word “lapidate” without reference to “lap,” “id,” or “ate.”

Leibniz’ symbol for the integral was readily accepted by many early mathematicians 
because they liked to think of integration as a kind of “summation process” which enabled 
them to add together infinitely many “infinitesimally small quantities.” For example, the 
area of the parabolic segment was conceived of as a sum of infinitely many infinitesimally
thin rectangles of height $x^2$ and base dx. The integral sign represented the process of adding
the areas of all these thin rectangles. This kind of thinking is suggestive and often very 
helpful, but it is not easy to assign a precise meaning to the idea of an “infinitesimally small 
quantity.” Today the integral is defined in terms of the notion of real number without 
using ideas like “infinitesimals.” This definition is given in Chapter 1. 

I 1.6 The approach to calculus to be used in this book

A thorough and complete treatment of either integral or differential calculus depends 
ultimately on a careful study of the real number system. This study in itself, when carried 
out in full, is an interesting but somewhat lengthy program that requires a small volume 
for its complete exposition. The approach in this book is to begin with the real numbers 
as undefined objects and simply to list a number of fundamental properties of real numbers 
which we shall take as axioms. These axioms and some of the simplest theorems that can 
be deduced from them are discussed in Part 3 of this chapter. 

Most of the properties of real numbers discussed here are probably familiar to the reader 
from his study of elementary algebra. However, there are a few properties of real numbers 
that do not ordinarily come into consideration in elementary algebra but which play an
important role in the calculus. These properties stem from the so-called least-upper-bound
axiom (also known as the completeness or continuity axiom) which is dealt with here in some
detail. The reader may wish to study Part 3 before proceeding with the main body of the
text, or he may postpone reading this material until later when he reaches those parts of the
theory that make use of least-upper-bound properties. Material in the text that depends on
the least-upper-bound axiom will be clearly indicated.

To develop calculus as a complete, formal mathematical theory, it would be necessary 
to state, in addition to the axioms for the real number system, a list of the various “methods 
of proof” which would be permitted for the purpose of deducing theorems from the axioms. 
Every statement in the theory would then have to be justified either as an “established law” 
(that is, an axiom, a definition, or a previously proved theorem) or as the result of applying 
one of the acceptable methods of proof to an established law. A program of this sort would
be extremely long and tedious and would add very little to a beginner’s understanding of 
the subject. Fortunately, it is not necessary to proceed in this fashion in order to get a good 
understanding and a good working knowledge of calculus. In this book the subject is 
introduced in an informal way, and ample use is made of geometric intuition whenever it is 
convenient to do so. At the same time, the discussion proceeds in a manner that is con- 
sistent with modern standards of precision and clarity of thought. All the important 
theorems of the subject are explicitly stated and rigorously proved. 

To avoid interrupting the principal flow of ideas, some of the proofs appear in separate 
starred sections. For the same reason, some of the chapters are accompanied by supple- 
mentary material in which certain important topics related to calculus are dealt with in 
detail. Some of these are also starred to indicate that they may be omitted or postponed 
without disrupting the continuity of the presentation. The extent to which the starred 
sections are taken up or not will depend partly on the reader’s background and skill and 
partly on the depth of his interests. A person who is interested primarily in the basic 
techniques may skip the starred sections. Those who wish a more thorough course in 
calculus, including theory as well as technique, should read some of the starred sections. 


Part 2. Some Basic Concepts of the Theory of Sets 


I 2.1 Introduction to set theory

In discussing any branch of mathematics, be it analysis, algebra, or geometry, it is helpful 
to use the notation and terminology of set theory. This subject, which was developed by 
Boole and Cantor† in the latter part of the 19th Century, has had a profound influence on the 
development of mathematics in the 20th Century. It has unified many seemingly discon- 
nected ideas and has helped to reduce many mathematical concepts to their logical founda- 
tions in an elegant and systematic way. A thorough treatment of the theory of sets would 
require a lengthy discussion which we regard as outside the scope of this book. Fortunately, 
the basic notions are few in number, and it is possible to develop a working knowledge of the 
methods and ideas of set theory through an informal discussion. Actually, we shall discuss 
not so much a new theory as an agreement about the precise terminology that we wish to 
apply to more or less familiar ideas. 

In mathematics, the word “set” is used to represent a collection of objects viewed as a 
single entity. The collections called to mind by such nouns as “flock,” “tribe,” “crowd,” 
“team,” and “electorate” are all examples of sets. The individual objects in the collection 
are called elements or members of the set, and they are said to belong to or to be contained in 
the set. The set, in turn, is said to contain or be composed of its elements. 


† George Boole (1815-1864) was an English mathematician and logician. His book, An Investigation of the 
Laws of Thought, published in 1854, marked the creation of the first workable system of symbolic logic. 
Georg F. L. P. Cantor (1845-1918) and his school created the modern theory of sets during the period 
1874-1895. 

We shall be interested primarily in sets of mathematical objects: sets of numbers, sets of 
curves, sets of geometric figures, and so on. In many applications it is convenient to deal 
with sets in which nothing special is assumed about the nature of the individual objects in 
the collection. These are called abstract sets. Abstract set theory has been developed to deal 
with such collections of arbitrary objects, and from this generality the theory derives its power. 

I 2.2 Notations for designating sets 

Sets usually are denoted by capital letters: A, B, C, . . . , X, Y, Z; elements are designated 
by lower-case letters: a, b, c, . . . , x, y, z. We use the special notation 

x ∈ S 

to mean that “x is an element of S” or “x belongs to S.” If x does not belong to S, we write 
x ∉ S. When convenient, we shall designate sets by displaying the elements in braces; for 
example, the set of positive even integers less than 10 is denoted by the symbol {2, 4, 6, 8} 
whereas the set of all positive even integers is displayed as {2, 4, 6, . . .}, the three dots 
taking the place of “and so on.” The dots are used only when the meaning of “and so on” 
is clear. The method of listing the members of a set within braces is sometimes referred to as 
the roster notation. 

The first basic concept that relates one set to another is equality of sets: 

DEFINITION OF SET EQUALITY. Two sets A and B are said to be equal (or identical) if 
they consist of exactly the same elements, in which case we write A = B. If one of the sets 
contains an element not in the other, we say the sets are unequal and we write A ≠ B. 

EXAMPLE 1. According to this definition, the two sets {2, 4, 6, 8} and {2, 8, 6, 4} are 
equal since they both consist of the four integers 2, 4, 6, and 8. Thus, when we use the roster 
notation to describe a set, the order in which the elements appear is irrelevant. 

EXAMPLE 2. The sets {2, 4, 6, 8} and {2, 2, 4, 4, 6, 8} are equal even though, in the second 
set, each of the elements 2 and 4 is listed twice. Both sets contain the four elements 2, 4, 6, 8 
and no others; therefore, the definition requires that we call these sets equal. This example 
shows that we do not insist that the objects listed in the roster notation be distinct. A similar 
example is the set of letters in the word Mississippi, which is equal to the set {M, i, s, p}, 
consisting of the four distinct letters M, i, s, and p. 
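Examples 1 and 2 can be tried out with Python's built-in set type, which behaves like the roster notation in both respects. This is only an illustrative sketch; the variable names are chosen here and do not come from the text.

```python
# Roster notation modeled with Python's built-in set type.
a = {2, 4, 6, 8}
b = {2, 8, 6, 4}          # same elements listed in a different order
c = {2, 2, 4, 4, 6, 8}    # repeated entries collapse to one element each

order_irrelevant = (a == b)
repeats_irrelevant = (a == c)

# The set of letters in the word "Mississippi" (upper and lower case are
# different characters, so "M" and "s" are distinct elements).
letters = set("Mississippi")
number_of_letters = len(letters)
```

Both comparisons are true, and the set of letters has exactly four members.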

I 2.3 Subsets 

From a given set S we may form new sets, called subsets of S. For example, the set 
consisting of those positive integers less than 10 which are divisible by 4 (the set {4, 8}) is a 
subset of the set of all even integers less than 10. In general, we have the following definition. 

DEFINITION OF A SUBSET. A set A is said to be a subset of a set B, and we write 

A ⊆ B, 

whenever every element of A also belongs to B. We also say that A is contained in B or that B 
contains A. The relation ⊆ is referred to as set inclusion. 

The statement A ⊆ B does not rule out the possibility that B ⊆ A. In fact, we may have 
both A ⊆ B and B ⊆ A, but this happens only if A and B have the same elements. In 
other words, 

A = B if and only if A ⊆ B and B ⊆ A. 

This theorem is an immediate consequence of the foregoing definitions of equality and 
inclusion. If A ⊆ B but A ≠ B, then we say that A is a proper subset of B; we indicate this 
by writing A ⊂ B. 
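As a small illustration, Python's set type uses `<=` for inclusion (⊆) and `<` for proper inclusion (⊂). The sketch below uses the sets from the example above; the helper name equal_by_inclusion is hypothetical and simply restates the theorem "A = B if and only if A ⊆ B and B ⊆ A".

```python
A = {4, 8}            # positive integers below 10 divisible by 4
B = {2, 4, 6, 8}      # positive even integers below 10

is_subset = A <= B            # A ⊆ B
is_proper_subset = A < B      # A ⊆ B and A ≠ B
reverse_inclusion = B <= A    # B ⊆ A fails here


def equal_by_inclusion(X, Y):
    """Test equality of two sets through mutual inclusion."""
    return X <= Y and Y <= X


same_elements = equal_by_inclusion({2, 4, 6, 8}, {8, 6, 4, 2})
different_elements = equal_by_inclusion(A, B)
```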

In all our applications of set theory, we have a fixed set S given in advance, and we are 
concerned only with subsets of this given set. The underlying set S may vary from one 
application to another; it will be referred to as the universal set of each particular discourse. 
The notation 

{x | x ∈ S and x satisfies P} 

will designate the set of all elements x in S which satisfy the property P. When the universal 
set to which we are referring is understood, we omit the reference to S and write simply 
{x | x satisfies P}. This is read “the set of all x such that x satisfies P.” Sets designated in 
this way are said to be described by a defining property. For example, the set of all positive 
real numbers could be designated as {x | x > 0}; the universal set S in this case is understood 
to be the set of all real numbers. Similarly, the set of all even positive integers {2, 4, 6, . . .} 
can be designated as {x | x is a positive even integer}. Of course, the letter x is a dummy and 
may be replaced by any other convenient symbol. Thus, we may write 

{x | x > 0} = {y | y > 0} = {t | t > 0} 

and so on. 
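The defining-property notation has a close analogue in Python's set comprehensions. The sketch below assumes a small finite universal set S (the integers from -5 to 5), chosen only so that the sets can be listed; the text's universal set of all real numbers cannot be enumerated this way.

```python
# Hypothetical finite universal set standing in for S.
S = set(range(-5, 6))

# {x | x in S and x > 0}
positives = {x for x in S if x > 0}

# The bound variable is a dummy: renaming it gives the same set.
positives_again = {t for t in S if t > 0}

# {x | x in S and x is a positive even integer}
positive_evens = {x for x in S if x > 0 and x % 2 == 0}
```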

It is possible for a set to contain no elements whatever. This set is called the empty set 
or the void set, and will be denoted by the symbol ∅. We will consider ∅ to be a subset of 
every set. Some people find it helpful to think of a set as analogous to a container (such as a 
bag or a box) containing certain objects, its elements. The empty set is then analogous to an 
empty container. 

To avoid logical difficulties, we must distinguish between the element x and the set {x} 
whose only element is x. (A box with a hat in it is conceptually distinct from the hat itself.) 
In particular, the empty set ∅ is not the same as the set {∅}. In fact, the empty set ∅ contains 
no elements, whereas the set {∅} has one element, ∅. (A box which contains an empty box 
is not empty.) Sets consisting of exactly one element are sometimes called one-element sets. 
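The distinction between ∅ and {∅} can be made concrete in Python, with one caveat: an ordinary Python set cannot be an element of another set, so the sketch below uses frozenset (the immutable variant) for sets that are themselves elements. The names are illustrative only.

```python
empty = frozenset()                        # the empty set
box_holding_empty_box = frozenset([empty]) # the set whose only element is the empty set

size_of_empty = len(empty)
size_of_box = len(box_holding_empty_box)

are_different = empty != box_holding_empty_box
empty_is_member = empty in box_holding_empty_box

# The empty set is a subset of every set, for example of {1, 2, 3}.
empty_is_subset = empty <= {1, 2, 3}
```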

Diagrams often help us visualize relations between sets. For example, we may think of a 
set S as a region in the plane and each of its elements as a point. Subsets of S may then be 
thought of as collections of points within S. For example, in Figure 1.6(b) the shaded portion 
is a subset of A and also a subset of B. Visual aids of this type, called Venn diagrams, are 
useful for testing the validity of theorems in set theory or for suggesting methods to prove 
them. Of course, the proofs themselves must rely only on the definitions of the concepts and 
not on the diagrams. 

I 2.4 Unions, intersections, complements 

From two given sets A and B, we can form a new set called the union of A and B. This 
new set is denoted by the symbol 

A ∪ B (read: “A union B”) 

and is defined as the set of those elements which are in A, in B, or in both. That is to say, 
A ∪ B is the set of all elements which belong to at least one of the sets A, B. An example is 
illustrated in Figure 1.6(a), where the shaded portion represents A ∪ B. 

[Figure 1.6 Unions and intersections. Panels: (a) A ∪ B; (b) A ∩ B; (c) A ∩ B = ∅.] 

Similarly, the intersection of A and B, denoted by 

A ∩ B (read: “A intersection B”), 

is defined as the set of those elements common to both A and B. This is illustrated by the 
shaded portion of Figure 1.6(b). In Figure 1.6(c), the two sets A and B have no elements in 
common; in this case, their intersection is the empty set ∅. Two sets A and B are said to be 
disjoint if A ∩ B = ∅. 

If A and B are sets, the difference A - B (also called the complement of B relative to A) is 
defined to be the set of all elements of A which are not in B. Thus, by definition, 

A - B = {x | x ∈ A and x ∉ B}. 

In Figure 1.6(b) the unshaded portion of A represents A - B; the unshaded portion of B 
represents B - A. 

The operations of union and intersection have many formal similarities to (as well as 
differences from) ordinary addition and multiplication of real numbers. For example, 
since there is no question of order involved in the definitions of union and intersection, it 
follows that A ∪ B = B ∪ A and that A ∩ B = B ∩ A. That is to say, union and intersection 
are commutative operations. The definitions are also phrased in such a way that the 
operations are associative: 

(A ∪ B) ∪ C = A ∪ (B ∪ C) and (A ∩ B) ∩ C = A ∩ (B ∩ C). 

These and other theorems related to the “algebra of sets” are listed as Exercises in Section 
I 2.5. One of the best ways for the reader to become familiar with the terminology and 
notations introduced above is to carry out the proofs of each of these laws. A sample of the 
type of argument that is needed appears immediately after the Exercises. 
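The three operations have direct counterparts in Python's set type: `|` for union, `&` for intersection, and `-` for difference. The sets below are made-up examples, and the checks illustrate the commutative and associative laws on these particular sets only; they are not a substitute for the proofs.

```python
A = {1, 2, 3}
B = {3, 4}
C = {4, 5}

union = A | B             # elements in A, in B, or in both
intersection = A & B      # elements common to A and B
difference = A - B        # elements of A which are not in B

disjoint = A.isdisjoint(C)    # True when the intersection of A and C is empty

commutative = (A | B == B | A) and (A & B == B & A)
associative = ((A | B) | C == A | (B | C)) and ((A & B) & C == A & (B & C))
```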

The operations of union and intersection can be extended to finite or infinite collections 
of sets as follows: Let ℱ be a nonempty class† of sets. The union of all the sets in ℱ is 
defined as the set of those elements which belong to at least one of the sets in ℱ and is 
denoted by the symbol 

⋃_{A ∈ ℱ} A. 

If ℱ is a finite collection of sets, say ℱ = {A_1, A_2, . . . , A_n}, we write 

⋃_{A ∈ ℱ} A = ⋃_{k=1}^{n} A_k = A_1 ∪ A_2 ∪ · · · ∪ A_n. 

Similarly, the intersection of all the sets in ℱ is defined to be the set of those elements 
which belong to every one of the sets in ℱ; it is denoted by the symbol 

⋂_{A ∈ ℱ} A. 

For finite collections (as above), we write 

⋂_{A ∈ ℱ} A = ⋂_{k=1}^{n} A_k = A_1 ∩ A_2 ∩ · · · ∩ A_n. 

Unions and intersections have been defined in such a way that the associative laws for 
these operations are automatically satisfied. Hence, there is no ambiguity when we write 
A_1 ∪ A_2 ∪ · · · ∪ A_n or A_1 ∩ A_2 ∩ · · · ∩ A_n. 

† To help simplify the language, we call a collection of sets a class. Capital script letters 𝒜, ℬ, 𝒞, . . . are 
used to denote classes. The usual terminology and notation of set theory applies, of course, to classes. Thus, 
for example, A ∈ ℱ means that A is one of the sets in the class ℱ, and 𝒜 ⊆ ℬ means that every set in 𝒜 
is also in ℬ, and so forth. 
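For a finite class of sets, the union and intersection over the class can be computed in one step. The sketch below represents the class as a Python list named family (a made-up example with three sets); infinite classes are outside what such a computation can handle.

```python
# A hypothetical finite class of sets.
family = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

# Elements belonging to at least one set in the class.
union_of_family = set().union(*family)

# Elements belonging to every set in the class (the class must be nonempty).
intersection_of_family = set.intersection(*family)

# The same union, written out term by term.
written_out = family[0] | family[1] | family[2]
```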


I 2.5 Exercises 

1. Use the roster notation to designate the following sets of real numbers. 

A = {x | x^2 - 1 = 0}.        D = {x | x^3 - 2x^2 + x = 2}. 
B = {x | (x - 1)^2 = 0}.      E = {x | (x + 8)^2 = 9^2}. 
C = {x | x + 8 = 9}.          F = {x | (x^2 + 16x)^2 = 17^2}. 

2. For the sets in Exercise 1, note that B ⊆ A. List all the inclusion relations ⊆ that hold among 
the sets A, B, C, D, E, F. 

3. Let A = {1}, B = {1, 2}. Discuss the validity of the following statements (prove the ones that 
are true and explain why the others are not true). 

(a) A ⊂ B.    (d) 1 ∈ A. 
(b) A ⊆ B.    (e) 1 ⊆ A. 
(c) A ∈ B.    (f) 1 ⊂ B. 

4. Solve Exercise 3 if A = {1} and B = {{1}, 1}. 

5. Given the set S = {1, 2, 3, 4}. Display all subsets of S. There are 16 altogether, counting 
∅ and S. 

6. Given the following four sets 

A = {1, 2},  B = {{1}, {2}},  C = {{1}, {1, 2}},  D = {{1}, {2}, {1, 2}}, 

discuss the validity of the following statements (prove the ones that are true and explain why 
the others are not true). 

(a) A = B.    (d) A ∈ C.    (g) B ⊂ D. 
(b) A ⊆ B.    (e) A ⊂ D.    (h) B ∈ D. 
(c) A ⊂ C.    (f) B ⊂ C.    (i) A ∈ D. 

7. Prove the following properties of set equality. 

(a) {a, a} = {a}. 

(b) {a, b} = {b, a}. 

(c) {a} = {b, c } if and only if a = b = c. 

Prove the set relations in Exercises 8 through 19. (Sample proofs are given at the end of this 
section). 

8. Commutative laws: A ∪ B = B ∪ A, A ∩ B = B ∩ A. 

9. Associative laws: A ∪ (B ∪ C) = (A ∪ B) ∪ C, A ∩ (B ∩ C) = (A ∩ B) ∩ C. 

10. Distributive laws: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). 

11. A ∪ A = A, A ∩ A = A. 

12. A ⊆ A ∪ B, A ∩ B ⊆ A. 

13. A ∪ ∅ = A, A ∩ ∅ = ∅. 

14. A ∪ (A ∩ B) = A, A ∩ (A ∪ B) = A. 

15. If A ⊆ C and B ⊆ C, then A ∪ B ⊆ C. 

16. If C ⊆ A and C ⊆ B, then C ⊆ A ∩ B. 

17. (a) If A ⊂ B and B ⊂ C, prove that A ⊂ C. 

(b) If A ⊆ B and B ⊆ C, prove that A ⊆ C. 

(c) What can you conclude if A ⊂ B and B ⊆ C? 

(d) If x ∈ A and A ⊆ B, is it necessarily true that x ∈ B? 

(e) If x ∈ A and A ∈ B, is it necessarily true that x ∈ B? 

18. A - (B ∩ C) = (A - B) ∪ (A - C). 

19. Let ℱ be a class of sets. Then 

B - ⋃_{A ∈ ℱ} A = ⋂_{A ∈ ℱ} (B - A) and B - ⋂_{A ∈ ℱ} A = ⋃_{A ∈ ℱ} (B - A). 


20. (a) Prove that one of the following two formulas is always right and the other one is sometimes 
wrong : 

(i) A - (B - C) = (A - B) ∪ C, 

(ii) A - (B ∪ C) = (A - B) - C. 


(b) State an additional necessary and sufficient condition for the formula which is sometimes 
incorrect to be always right. 

Proof of the commutative law A ∪ B = B ∪ A. Let X = A ∪ B, Y = B ∪ A. To 
prove that X = Y we prove that X ⊆ Y and Y ⊆ X. Suppose that x ∈ X. Then x is 
in at least one of A or B. Hence, x is in at least one of B or A; so x ∈ Y. Thus, every 
element of X is also in Y, so X ⊆ Y. Similarly, we find that Y ⊆ X, so X = Y. 


Proof of A ∩ B ⊆ A. If x ∈ A ∩ B, then x is in both A and B. In particular, x ∈ A. 
Thus, every element of A ∩ B is also in A; therefore, A ∩ B ⊆ A. 
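Before attempting a proof of one of the set identities, it can be reassuring to test it exhaustively on a small universal set. The sketch below lists all 16 subsets of {1, 2, 3, 4} and checks two of the laws from the Exercises on every triple of subsets. The function names power_set and holds_for_all_triples are hypothetical helpers, and a check of this kind is evidence only; it does not replace an element-by-element argument like the two sample proofs.

```python
from itertools import combinations, product


def power_set(universe):
    """Return all subsets of a finite set, each as a frozenset."""
    items = sorted(universe)
    return [frozenset(chosen)
            for size in range(len(items) + 1)
            for chosen in combinations(items, size)]


def holds_for_all_triples(identity, sets):
    """Check a three-set identity on every ordered triple drawn from sets."""
    return all(identity(A, B, C) for A, B, C in product(sets, repeat=3))


subsets = power_set({1, 2, 3, 4})

# First distributive law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
distributive_holds = holds_for_all_triples(
    lambda A, B, C: A & (B | C) == (A & B) | (A & C), subsets)

# Difference law: A - (B ∩ C) = (A - B) ∪ (A - C).
difference_law_holds = holds_for_all_triples(
    lambda A, B, C: A - (B & C) == (A - B) | (A - C), subsets)
```

The same helper can be pointed at any other three-set formula by changing the lambda expression.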






Part 3. A Set of Axioms for the Real-Number System 


I 3.1 Introduction 

There are many ways to introduce the real-number system. One popular method is to 
begin with the positive integers 1, 2, 3, . . . and use them as building blocks to construct a 
more comprehensive system having the properties desired. Briefly, the idea of this method 
is to take the positive integers as undefined concepts, state some axioms concerning 
them, and then use the positive integers to build a larger system consisting of the positive 
rational numbers (quotients of positive integers). The positive rational numbers, in turn, 
may then be used as a basis for constructing the positive irrational numbers (real numbers 
like √2 and π that are not rational). The final step is the introduction of the negative real 
numbers and zero. The most difficult part of the whole process is the transition from the 
rational numbers to the irrational numbers. 

Although the need for irrational numbers was apparent to the ancient Greeks from 
their study of geometry, satisfactory methods for constructing irrational numbers from 
rational numbers were not introduced until late in the 19th Century. At that time, three 
different theories were outlined by Karl Weierstrass (1815-1897), Georg Cantor (1845-1918), 
and Richard Dedekind (1831-1916). In 1889, the Italian mathematician Giuseppe 
Peano (1858-1932) listed five axioms for the positive integers that could be used as the 
starting point of the whole construction. A detailed account of this construction, beginning 
with the Peano postulates and using the method of Dedekind to introduce irrational 
numbers, may be found in a book by E. Landau, Foundations of Analysis (New York, 
Chelsea Publishing Co., 1951). 

The point of view we shall adopt here is nonconstructive. We shall start rather far out 
in the process, taking the real numbers themselves as undefined objects satisfying a number 
of properties that we use as axioms. That is to say, we shall assume there exists a set R of 
objects, called real numbers, which satisfy the 10 axioms listed in the next few sections. All 
the properties of real numbers can be deduced from the axioms in the list. When the real 
numbers are defined by a constructive process, the properties we list as axioms must be 
proved as theorems. 

In the axioms that appear below, lower-case letters a, b, c, . . . , x, y, z represent arbitrary 
real numbers unless something is said to the contrary. The axioms fall in a natural way into 
three groups which we refer to as the field axioms, the order axioms, and the least-upper-bound 
axiom (also called the axiom of continuity or the completeness axiom). 


I 3.2 The field axioms 

Along with the set R of real numbers we assume the existence of two operations called 
addition and multiplication, such that for every pair of real numbers x and y we can form the 
sum of x and y, which is another real number denoted by x + y, and the product of x and y, 
denoted by xy or by x · y. It is assumed that the sum x + y and the product xy are uniquely 
determined by x and y. In other words, given x and y, there is exactly one real number 
x + y and exactly one real number xy. We attach no special meanings to the symbols 
+ and · other than those contained in the axioms. 

AXIOM 1. COMMUTATIVE LAWS. x + y = y + x, xy = yx. 

AXIOM 2. ASSOCIATIVE LAWS. x + (y + z) = (x + y) + z, x(yz) = (xy)z. 

AXIOM 3. DISTRIBUTIVE LAW. x(y + z) = xy + xz. 

AXIOM 4. EXISTENCE OF IDENTITY ELEMENTS. There exist two distinct real numbers, which 
we denote by 0 and 1, such that for every real x we have x + 0 = x and 1 · x = x. 

AXIOM 5. EXISTENCE OF NEGATIVES. For every real number x there is a real number y 
such that x + y = 0. 

AXIOM 6. EXISTENCE OF RECIPROCALS. For every real number x ≠ 0 there is a real 
number y such that xy = 1. 

Note: The numbers 0 and 1 in Axioms 5 and 6 are those of Axiom 4. 

From the above axioms we can deduce all the usual laws of elementary algebra. The 
most important of these laws are collected here as a list of theorems. In all these theorems 
the symbols a, b, c, d represent arbitrary real numbers. 

THEOREM 1.1. CANCELLATION LAW FOR ADDITION. If a + b = a + c, then b = c. (In 
particular, this shows that the number 0 of Axiom 4 is unique.) 

THEOREM 1.2. POSSIBILITY OF SUBTRACTION. Given a and b, there is exactly one x such 
that a + x = b. This x is denoted by b - a. In particular, 0 - a is written simply -a and 
is called the negative of a. 

THEOREM 1.3. b - a = b + (-a). 

THEOREM 1.4. -(-a) = a. 

THEOREM 1.5. a(b - c) = ab - ac. 

THEOREM 1.6. 0 · a = a · 0 = 0. 

THEOREM 1.7. CANCELLATION LAW FOR MULTIPLICATION. If ab = ac and a ≠ 0, then 
b = c. (In particular, this shows that the number 1 of Axiom 4 is unique.) 

THEOREM 1.8. POSSIBILITY OF DIVISION. Given a and b with a ≠ 0, there is exactly one x 
such that ax = b. This x is denoted by b/a or \frac{b}{a} and is called the quotient of b and a. In 
particular, 1/a is also written a^{-1} and is called the reciprocal of a. 

THEOREM 1.9. If a ≠ 0, then b/a = b · a^{-1}. 

THEOREM 1.10. If a ≠ 0, then (a^{-1})^{-1} = a. 

THEOREM 1.11. If ab = 0, then a = 0 or b = 0. 

THEOREM 1.12. (-a)b = -(ab) and (-a)(-b) = ab. 

THEOREM 1.13. (a/b) + (c/d) = (ad + bc)/(bd) if b ≠ 0 and d ≠ 0. 

THEOREM 1.14. (a/b)(c/d) = (ac)/(bd) if b ≠ 0 and d ≠ 0. 

THEOREM 1.15. (a/b)/(c/d) = (ad)/(bc) if b ≠ 0, c ≠ 0, and d ≠ 0. 

To illustrate how these statements may be obtained as consequences of the axioms, we 
shall present proofs of Theorems 1.1 through 1.4. Those readers who are interested may 
find it instructive to carry out proofs of the remaining theorems. 

Proof of 1.1. Given a + b = a + c. By Axiom 5, there is a number y such that y + a = 0. 
Since sums are uniquely determined, we have y + (a + b) = y + (a + c). Using the 
associative law, we obtain (y + a) + b = (y + a) + c or 0 + b = 0 + c. But by Axiom 4 
we have 0 + b = b and 0 + c = c, so that b = c. Notice that this theorem shows that there 
is only one real number having the property of 0 in Axiom 4. In fact, if 0 and 0’ both have 
this property, then 0 + 0’ = 0 and 0 + 0 = 0. Hence 0 + 0’ = 0 + 0 and, by the 
cancellation law, 0 = 0’. 

Proof of 1.2. Given a and b, choose y so that a + y = 0 and let x = y + b. Then 
a + x = a + (y + b) = (a + y) + b = 0 + b = b. Therefore there is at least one x 
such that a + x = b. But by Theorem 1.1 there is at most one such x. Hence there is 
exactly one. 

Proof of 1.3. Let x = b - a and let y = b + (-a). We wish to prove that x = y. 
Now x + a = b (by the definition of b - a) and 

y + a = [b + (-a)] + a = b + [(-a) + a] = b + 0 = b. 

Therefore x + a = y + a and hence, by Theorem 1.1, x = y. 

Proof of 1.4. We have a + (-a) = 0 by the definition of -a. But this equation tells us 
that a is the negative of -a. That is, a = -(-a), as asserted. 
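The theorems in this list are statements about all real numbers, so no finite computation can prove them. As a sanity check only, the sketch below tests Theorems 1.13 through 1.15 on a handful of rational values, using Python's fractions.Fraction type for exact arithmetic. The sample values and the name theorems_hold are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# A few nonzero rational sample values (exact, no rounding).
values = [Fraction(n, d) for n in (-3, 2, 5) for d in (1, 2)]


def theorems_hold(a, b, c, d):
    """Check Theorems 1.13, 1.14, 1.15 for one choice of nonzero b, c, d."""
    t13 = a / b + c / d == (a * d + b * c) / (b * d)
    t14 = (a / b) * (c / d) == (a * c) / (b * d)
    t15 = (a / b) / (c / d) == (a * d) / (b * c)
    return t13 and t14 and t15


# Every value in the list is nonzero, so all the divisions are defined.
all_hold = all(theorems_hold(a, b, c, d)
               for a, b, c, d in product(values, repeat=4))
```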

*I 3.3 Exercises 

1. Prove Theorems 1.5 through 1.15, using Axioms 1 through 6 and Theorems 1.1 through 1.4. 

In Exercises 2 through 10, prove the given statements or establish the given equations. You 
may use Axioms 1 through 6 and Theorems 1.1 through 1.15. 

2. -0 = 0. 

3. 1^{-1} = 1. 

4. Zero has no reciprocal. 

5. -(a + b) = -a - b. 

6. -(a - b) = -a + b. 

7. (a - b) + (b - c) = a - c. 

8. If a ≠ 0 and b ≠ 0, then (ab)^{-1} = a^{-1}b^{-1}. 

9. -(a/b) = (-a/b) = a/(-b) if b ≠ 0. 

10. (a/b) - (c/d) = (ad - bc)/(bd) if b ≠ 0 and d ≠ 0. 


I 3.4 The order axioms 

This group of axioms has to do with a concept which establishes an ordering among the 
real numbers. This ordering enables us to make statements about one real number being 
larger or smaller than another. We choose to introduce the order properties as a set of 
axioms about a new undefined concept called positiveness and then to define terms like 
less than and greater than in terms of positiveness. 

We shall assume that there exists a certain subset R+ ⊂ R, called the set of positive 
numbers, which satisfies the following three order axioms: 

AXIOM 7. If x and y are in R+, so are x + y and xy. 

AXIOM 8. For every real x ≠ 0, either x ∈ R+ or -x ∈ R+, but not both. 

AXIOM 9. 0 ∉ R+. 

Now we can define the symbols <, >, ≤, and ≥, called, respectively, less than, greater 
than, less than or equal to, and greater than or equal to, as follows: 

x < y means that y - x is positive; 
y > x means that x < y; 
x ≤ y means that either x < y or x = y; 
y ≥ x means that x ≤ y. 

Thus, we have x > 0 if and only if x is positive. If x < 0, we say that x is negative; if 
x ≥ 0, we say that x is nonnegative. A pair of simultaneous inequalities such as x < y, 
y < z is usually written more briefly as x < y < z; similar interpretations are given to the 
compound inequalities x ≤ y < z, x < y ≤ z, and x ≤ y ≤ z. 
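The four definitions reduce every comparison to a single question: is a certain difference positive? The sketch below mirrors that structure with rational numbers. Note the assumption: is_positive stands in for membership in R+ and simply delegates to Python's own comparison with 0, so this illustrates how the definitions are layered rather than constructing the ordering from scratch. All function names are hypothetical.

```python
from fractions import Fraction


def is_positive(x):
    """Stand-in for 'x belongs to R+' (assumption: uses Python's built-in order)."""
    return x > 0


def less_than(x, y):
    """x < y means that y - x is positive."""
    return is_positive(y - x)


def greater_than(y, x):
    """y > x means that x < y."""
    return less_than(x, y)


def less_or_equal(x, y):
    """x <= y means that either x < y or x = y."""
    return less_than(x, y) or x == y


def greater_or_equal(y, x):
    """y >= x means that x <= y."""
    return less_or_equal(x, y)


third, half = Fraction(1, 3), Fraction(1, 2)
third_below_half = less_than(third, half)
half_not_below_itself = not less_than(half, half)
half_at_most_itself = less_or_equal(half, half)
```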

From the order axioms we can derive all the usual rules for calculating with inequalities. 
The most important of these are listed here as theorems. 

THEOREM 1.16. TRICHOTOMY LAW. For arbitrary real numbers a and b, exactly one of 
the three relations a < b, b < a, a = b holds. 

THEOREM 1.17. TRANSITIVE LAW. If a < b and b < c, then a < c. 

THEOREM 1.18. If a < b, then a + c < b + c. 

THEOREM 1.19. If a < b and c > 0, then ac < bc. 

THEOREM 1.20. If a ≠ 0, then a^2 > 0. 

THEOREM 1.21. 1 > 0. 

THEOREM 1.22. If a < b and c < 0, then ac > bc. 

THEOREM 1.23. If a < b, then -a > -b. In particular, if a < 0, then -a > 0. 

THEOREM 1.24. If ab > 0, then both a and b are positive or both are negative. 

THEOREM 1.25. If a < c and b < d, then a + b < c + d. 


Again, we shall prove only a few of these theorems as samples to indicate how the proofs 
may be carried out. Proofs of the others are left as exercises. 

Proof of 1.16. Let x = b - a. If x = 0, then b - a = a - b = 0, and hence, by Axiom 
9, we cannot have a > b or b > a. If x ≠ 0, Axiom 8 tells us that either x > 0 or x < 0, 
but not both; that is, either a < b or b < a, but not both. Therefore, exactly one of the 
three relations, a = b, a < b, b < a, holds. 

Proof of 1.17. If a < b and b < c, then b - a > 0 and c - b > 0. By Axiom 7 we may 
add to obtain (b - a) + (c - b) > 0. That is, c - a > 0, and hence a < c. 

Proof of 1.18. Let x = a + c, y = b + c. Then y - x = b - a. But b - a > 0 since 
a < b. Hence y - x > 0, and this means that x < y. 

Proof of 1.19. If a < b, then b - a > 0. If c > 0, then by Axiom 7 we may multiply 
c by (b - a) to obtain (b - a)c > 0. But (b - a)c = bc - ac. Hence bc - ac > 0, and 
this means that ac < bc, as asserted. 

Proof of 1.20. If a > 0, then a · a > 0 by Axiom 7. If a < 0, then -a > 0, and hence 
(-a) · (-a) > 0 by Axiom 7. In either case we have a^2 > 0. 

Proof of 1.21. Apply Theorem 1.20 with a = 1. 


*I 3.5 Exercises 

1. Prove Theorems 1.22 through 1.25, using the earlier theorems and Axioms 1 through 9. 

In Exercises 2 through 10, prove the given statements or establish the given inequalities. You 
may use Axioms 1 through 9 and Theorems 1.1 through 1.25. 

2. There is no real number x such that x^2 + 1 = 0. 

3. The sum of two negative numbers is negative. 

4. If a > 0, then 1/a > 0; if a < 0, then 1/a < 0. 

5. If 0 < a < b, then 0 < b^{-1} < a^{-1}. 

6. If a ≤ b and b ≤ c, then a ≤ c. 

7. If a ≤ b and b ≤ c, and a = c, then b = c. 

8. For all real a and b we have a^2 + b^2 ≥ 0. If a and b are not both 0, then a^2 + b^2 > 0. 

9. There is no real number a such that x ≤ a for all real x. 

10. If x has the property that 0 ≤ x < h for every positive real number h, then x = 0. 

I 3.6 Integers and rational numbers 

There exist certain subsets of R which are distinguished because they have special properties 
not shared by all real numbers. In this section we shall discuss two such subsets, the 
integers and the rational numbers. 

To introduce the positive integers we begin with the number 1, whose existence is guar- 
anteed by Axiom 4. The number 1 + 1 is denoted by 2, the number 2 + 1 by 3, and so on. 
The numbers 1, 2, 3, ... , obtained in this way by repeated addition of 1 are all positive, 
and they are called the positive integers. Strictly speaking, this description of the positive 
integers is not entirely complete because we have not explained in detail what we mean by 
the expressions “and so on,” or “repeated addition of 1.” Although the intuitive meaning 
of these expressions may seem clear, in a careful treatment of the real-number system it is 
necessary to give a more precise definition of the positive integers. There are many ways 
to do this. One convenient method is to introduce first the notion of an inductive set. 

DEFINITION OF AN INDUCTIVE SET. A set of real numbers is called an inductive set if it has 
the following two properties: 

(a) The number 1 is in the set. 

(b) For every x in the set, the number x + 1 is also in the set. 


For example, R is an inductive set. So is the set R+. Now we shall define the positive 
integers to be those real numbers which belong to every inductive set. 


DEFINITION OF POSITIVE INTEGERS. A real number is called a positive integer if it belongs 
to every inductive set. 

Let P denote the set of all positive integers. Then P is itself an inductive set because (a) 
it contains 1, and (b) it contains x + 1 whenever it contains x. Since the members of P 
belong to every inductive set, we refer to P as the smallest inductive set. This property of 
the set P forms the logical basis for a type of reasoning that mathematicians call proof by 
induction, a detailed discussion of which is given in Part 4 of this Introduction. 

The negatives of the positive integers are called the negative integers. The positive integers, 
together with the negative integers and 0 (zero), form a set Z which we call simply the 
set of integers. 

In a thorough treatment of the real-number system, it would be necessary at this stage to 
prove certain theorems about integers. For example, the sum, difference, or product of two 
integers is an integer, but the quotient of two integers need not be an integer. However, we 
shall not enter into the details of such proofs. 

Quotients of integers a/b (where b ≠ 0) are called rational numbers. The set of rational 
numbers, denoted by Q, contains Z as a subset. The reader should realize that all the field 
axioms and the order axioms are satisfied by Q. For this reason, we say that the set of 
rational numbers is an ordered field. Real numbers that are not in Q are called irrational. 


I 3.7 Geometric interpretation of real numbers as points on a line 

The reader is undoubtedly familiar with the geometric representation of real numbers 
by means of points on a straight line. A point is selected to represent 0 and another, to the 
right of 0, to represent 1, as illustrated in Figure 1.7. This choice determines the scale. 
If one adopts an appropriate set of axioms for Euclidean geometry, then each real number 
corresponds to exactly one point on this line and, conversely, each point on the line corre- 
sponds to one and only one real number. For this reason the line is often called the real line 
or the real axis, and it is customary to use the words real number and point interchangeably. 
Thus we often speak of the point x rather than the point corresponding to the real number x. 

The ordering relation among the real numbers has a simple geometric interpretation. 
If x < y, the point x lies to the left of the point y, as shown in Figure 1.7. Positive numbers 
lie to the right of 0 and negative numbers to the left of 0. If a < b, a point x satisfies the 
inequalities a < x < b if and only if x is between a and b. 

This device for representing real numbers geometrically is a very worthwhile aid that 
helps us to discover and understand better certain properties of real numbers. However, 
the reader should realize that all properties of real numbers that are to be accepted as 
theorems must be deducible from the axioms without any reference to geometry. This 
does not mean that one should not make use of geometry in studying properties of real 
numbers. On the contrary, the geometry often suggests the method of proof of a particular 
theorem, and sometimes a geometric argument is more illuminating than a purely analytic 
proof (one depending entirely on the axioms for the real numbers). In this book, geometric 
arguments are used to a large extent to help motivate or clarify a particular discussion. 
Nevertheless, the proofs of all the important theorems are presented in analytic form. 

[Figure 1.7 Real numbers represented geometrically on a line. Marked points, left to right: 0, 1, x, y.] 


I 3.8 Upper bound of a set, maximum element, least upper bound (supremum) 

The nine axioms listed above contain all the properties of real numbers usually discussed 
in elementary algebra. There is another axiom of fundamental importance in calculus that 
is ordinarily not discussed in elementary algebra courses. This axiom (or some property 
equivalent to it) is used to establish the existence of irrational numbers. 

Irrational numbers arise in elementary algebra when we try to solve certain quadratic 
equations. For example, it is desirable to have a real number x such that x^2 = 2. From the 
nine axioms above, we cannot prove that such an x exists in R, because these nine axioms 
are also satisfied by Q, and there is no rational number x whose square is 2. (A proof of this 
statement is outlined in Exercise 11 of Section I 3.12.) Axiom 10 allows us to introduce 
irrational numbers in the real-number system, and it gives the real-number system a property 
of continuity that is a keystone in the logical structure of calculus. 

Before we describe Axiom 10, it is convenient to introduce some more terminology and
notation. Suppose S is a nonempty set of real numbers and suppose there is a number B
such that

x ≤ B

for every x in S. Then S is said to be bounded above by B. The number B is called an upper
bound for S. We say an upper bound because every number greater than B will also be an
upper bound. If an upper bound B is also a member of S, then B is called the largest
member or the maximum element of S. There can be at most one such B. If it exists, we
write

B = max S.

Thus, B = max S if B ∈ S and x ≤ B for all x in S. A set with no upper bound is said to be
unbounded above.

The following examples serve to illustrate the meaning of these terms. 
EXAMPLE 1. Let S be the set of all positive real numbers. This set is unbounded above.
It has no upper bounds and it has no maximum element.

EXAMPLE 2. Let S be the set of all real x satisfying 0 ≤ x ≤ 1. This set is bounded
above by 1. In fact, 1 is its maximum element.

EXAMPLE 3. Let T be the set of all real x satisfying 0 ≤ x < 1. This is like the set in
Example 2 except that the point 1 is not included. This set is bounded above by 1 but it has
no maximum element.

Some sets, like the one in Example 3, are bounded above but have no maximum element. 
For these sets there is a concept which takes the place of the maximum element. This is 
called the least upper bound of the set and it is defined as follows: 

DEFINITION OF LEAST UPPER BOUND. A number B is called a least upper bound of a
nonempty set S if B has the following two properties:

(a) B is an upper bound for S.

(b) No number less than B is an upper bound for S.


If S has a maximum element, this maximum is also a least upper bound for S. But if S 
does not have a maximum element, it may still have a least upper bound. In Example 3 
above, the number 1 is a least upper bound for T although T has no maximum element. 
(See Figure 1.8.) 


[Figure: two number lines. (a) S has a largest member: max S = 1; the upper bounds for S
are the points from 1 rightward. (b) T has no largest member, but it has a least upper
bound: sup T = 1; the upper bounds for T are again the points from 1 rightward.]

Figure 1.8 Upper bounds, maximum element, supremum.

THEOREM 1.26. Two different numbers cannot be least upper bounds for the same set.

Proof. Suppose that B and C are two least upper bounds for a set S. Property (b)
implies that C ≥ B since B is a least upper bound; similarly, B ≥ C since C is a least upper
bound. Hence, we have B = C.

This theorem tells us that if there is a least upper bound for a set S, there is only one and 
we may speak of the least upper bound. 

It is common practice to refer to the least upper bound of a set by the more concise term 
supremum, abbreviated sup. We shall adopt this convention and write 

B = sup S 

to express the fact that B is the least upper bound, or supremum, of S. 
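
As an informal illustration (not part of the axiomatic development), the following Python sketch samples the set T of Example 3 at the points 1 − 1/k and checks properties (a) and (b) of the definition for the number 1. The helper name is_upper_bound, the choice of sample, and the test value 1 − 1/100 are assumptions made only for this illustration; a finite sample can suggest, but never prove, a statement about T.

```python
from fractions import Fraction

def is_upper_bound(B, sample):
    """Return True if x <= B for every x in the (finite) sample."""
    return all(x <= B for x in sample)

# Finite sample of T = {x : 0 <= x < 1}: the points 1 - 1/k for k = 1, ..., 1000.
sample_T = [1 - Fraction(1, k) for k in range(1, 1001)]

# Property (a): 1 is an upper bound for the sample, yet 1 is not a member of it.
one_is_bound = is_upper_bound(1, sample_T)
one_is_member = 1 in sample_T

# Property (b): a number less than 1, such as 1 - 1/100, is not an upper bound,
# because the member 1 - 1/101 of T exceeds it.
smaller = 1 - Fraction(1, 100)
smaller_is_bound = is_upper_bound(smaller, sample_T)

print(one_is_bound, one_is_member, smaller_is_bound)
```

The same check fails for every candidate below 1, which is the computational shadow of the statement sup T = 1.
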

I 3.9 The least-upper-bound axiom (completeness axiom)

Now we are ready to state the least-upper-bound axiom for the real-number system.

AXIOM 10. Every nonempty set S of real numbers which is bounded above has a supremum;
that is, there is a real number B such that B = sup S.

We emphasize once more that the supremum of S need not be a member of S. In fact,
sup S belongs to S if and only if S has a maximum element, in which case max S = sup S.

Definitions of the terms lower bound, bounded below, smallest member (or minimum 
element) may be similarly formulated. The reader should formulate these for himself. If 
S has a minimum element, we denote it by min S. 

A number L is called a greatest lower bound (or infimum) of S if (a) L is a lower bound for
S, and (b) no number greater than L is a lower bound for S. The infimum of S, when it 
exists, is uniquely determined and we denote it by inf S. If S has a minimum element, then 
min S = inf S. 

Using Axiom 10, we can prove the following. 

THEOREM 1.27. Every nonempty set S that is bounded below has a greatest lower bound;
that is, there is a real number L such that L = inf S.

Proof. Let −S denote the set of negatives of numbers in S. Then −S is nonempty and
bounded above. Axiom 10 tells us that there is a number B which is a supremum for −S.
It is easy to verify that −B = inf S.

Let us refer once more to the examples in the foregoing section. In Example 1, the set of 
all positive real numbers, the number 0 is the infimum of S. This set has no minimum 
element. In Examples 2 and 3, the number 0 is the minimum element. 

In all these examples it was easy to decide whether or not the set S was bounded above 
or below, and it was also easy to determine the numbers sup S and inf S. The next example 
shows that it may be difficult to determine whether upper or lower bounds exist. 

EXAMPLE 4. Let S be the set of all numbers of the form (1 + 1/n)^n, where n = 1, 2, 3, ....
For example, taking n = 1, 2, and 3, we find that the numbers 2, 9/4, and 64/27 are in S.
Every number in the set is greater than 1, so the set is bounded below and hence has an
infimum. With a little effort we can show that 2 is the smallest element of S, so inf S =
min S = 2. The set S is also bounded above, although this fact is not as easy to prove.
(Try it!) Once we know that S is bounded above, Axiom 10 tells us that there is a number
which is the supremum of S. In this case it is not easy to determine the value of sup S from
the description of S. In a later chapter we will learn that sup S is an irrational number
approximately equal to 2.718. It is an important number in calculus called the Euler
number e.
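
The claims in Example 4 can be explored numerically. The Python sketch below computes the first 200 members of S exactly with rational arithmetic; the function name term and the cutoff 200 are assumptions chosen for illustration. The computation is only evidence, not a proof, that the terms increase and stay below 3.

```python
from fractions import Fraction

def term(n):
    """Exact value of (1 + 1/n)^n as a rational number."""
    return (1 + Fraction(1, n)) ** n

terms = [term(n) for n in range(1, 201)]

# The first three members of S, as stated in the text: 2, 9/4, 64/27.
first_three = terms[:3]

# On this range the terms increase, so the smallest computed term is the first one.
increasing = all(a < b for a, b in zip(terms, terms[1:]))

# Every computed term stays below 3, consistent with S being bounded above.
below_three = all(t < 3 for t in terms)

print(first_three, increasing, below_three, float(terms[-1]))
```
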

I 3.10 The Archimedean property of the real-number system

This section contains a number of important properties of the real-number system which
are consequences of the least-upper-bound axiom.

THEOREM 1.28. The set P of positive integers 1, 2, 3, ... is unbounded above.

Proof. Assume P is bounded above. We shall show that this leads to a contradiction.
Since P is nonempty, Axiom 10 tells us that P has a least upper bound, say b. The number
b − 1, being less than b, cannot be an upper bound for P. Hence, there is at least one
positive integer n such that n > b − 1. For this n we have n + 1 > b. Since n + 1 is in
P, this contradicts the fact that b is an upper bound for P.

As corollaries of Theorem 1.28, we immediately obtain the following consequences: 

THEOREM 1.29. For every real x there exists a positive integer n such that n > x.

Proof. If this were not so, some x would be an upper bound for P, contradicting
Theorem 1.28.


THEOREM 1.30. If x > 0 and if y is an arbitrary real number, there exists a positive integer
n such that nx > y.

Proof. Apply Theorem 1.29 with x replaced by y/x.

The property described in Theorem 1.30 is called the Archimedean property of the real- 
number system. Geometrically it means that any line segment, no matter how long, may 
be covered by a finite number of line segments of a given positive length, no matter how 
small. In other words, a small ruler used often enough can measure arbitrarily large 
distances. Archimedes realized that this was a fundamental property of the straight line 
and stated it explicitly as one of the axioms of geometry. In the 19th and 20th centuries, 
non-Archimedean geometries have been constructed in which this axiom is rejected. 
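
For rational x and y, an integer n of the kind promised by Theorem 1.30 can be written down explicitly. The sketch below is one possible choice, not the only one; the function name archimedean_n is hypothetical, and the code relies on the floor function for rationals rather than on Axiom 10.

```python
from fractions import Fraction
import math

def archimedean_n(x, y):
    """Return a positive integer n with n*x > y, for x > 0 (compare Theorem 1.30)."""
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(y/x) + 1 exceeds y/x; taking the max with 1 keeps n positive when y <= 0.
    return max(1, math.floor(Fraction(y) / Fraction(x)) + 1)

# A short ruler of length 1/1000 measuring a distance of 7 units.
n = archimedean_n(Fraction(1, 1000), 7)
print(n, n * Fraction(1, 1000) > 7)
```
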

From the Archimedean property, we can prove the following theorem, which will be 
useful in our discussion of integral calculus. 

THEOREM 1.31. If three real numbers a, x, and y satisfy the inequalities

(1.14) a ≤ x ≤ a + y/n

for every integer n ≥ 1, then x = a.

Proof. If x > a, Theorem 1.30 tells us that there is a positive integer n satisfying
n(x − a) > y, contradicting (1.14). Hence we cannot have x > a, so we must have x = a.

I 3.11 Fundamental properties of the supremum and infimum

This section discusses three fundamental properties of the supremum and infimum that 
we shall use in our development of calculus. The first property states that any set of numbers 
with a supremum contains points arbitrarily close to its supremum; similarly, a set with an 
infimum contains points arbitrarily close to its infimum. 
THEOREM 1.32. Let h be a given positive number and let S be a set of real numbers.

(a) If S has a supremum, then for some x in S we have

x > sup S − h.

(b) If S has an infimum, then for some x in S we have

x < inf S + h.

Proof of (a). If we had x ≤ sup S − h for all x in S, then sup S − h would be an upper
bound for S smaller than its least upper bound. Therefore we must have x > sup S − h
for at least one x in S. This proves (a). The proof of (b) is similar.

THEOREM 1.33. ADDITIVE PROPERTY. Given nonempty subsets A and B of R, let C denote
the set

C = {a + b | a ∈ A, b ∈ B}.

(a) If each of A and B has a supremum, then C has a supremum, and

sup C = sup A + sup B.

(b) If each of A and B has an infimum, then C has an infimum, and

inf C = inf A + inf B.

Proof. Assume each of A and B has a supremum. If c ∈ C, then c = a + b, where
a ∈ A and b ∈ B. Therefore c ≤ sup A + sup B; so sup A + sup B is an upper bound for C.
This shows that C has a supremum and that

sup C ≤ sup A + sup B.

Now let n be any positive integer. By Theorem 1.32 (with h = 1/n) there is an a in A and a
b in B such that

a > sup A − 1/n,   b > sup B − 1/n.

Adding these inequalities, we obtain

a + b > sup A + sup B − 2/n,   or   sup A + sup B < a + b + 2/n ≤ sup C + 2/n,

since a + b ≤ sup C. Therefore we have shown that

sup C ≤ sup A + sup B < sup C + 2/n

for every integer n ≥ 1. By Theorem 1.31, we must have sup C = sup A + sup B. This
proves (a), and the proof of (b) is similar.
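
For finite nonempty sets the supremum is simply the maximum and the infimum is the minimum, so the additive property can be checked directly in that special case. The sets A and B below are arbitrary values chosen for illustration; the check says nothing about infinite sets, where the proof above is needed.

```python
from itertools import product

A = [-3, 1, 2]
B = [1, 4, 10]

# C = {a + b : a in A, b in B}
C = [a + b for a, b in product(A, B)]

# For finite nonempty sets, sup is max and inf is min.
sup_ok = max(C) == max(A) + max(B)
inf_ok = min(C) == min(A) + min(B)
print(sup_ok, inf_ok)
```
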

THEOREM 1.34. Given two nonempty subsets S and T of R such that

s ≤ t

for every s in S and every t in T. Then S has a supremum, and T has an infimum, and they
satisfy the inequality

sup S ≤ inf T.

Proof. Each t in T is an upper bound for S. Therefore S has a supremum which satisfies
the inequality sup S ≤ t for all t in T. Hence sup S is a lower bound for T, so T has an
infimum which cannot be less than sup S. In other words, we have sup S ≤ inf T, as
asserted.

*I 3.12 Exercises

1. If x and y are arbitrary real numbers with x < y, prove that there is at least one real z satisfying
x < z < y.

2. If x is an arbitrary real number, prove that there are integers m and n such that m < x < n.

3. If x > 0, prove that there is a positive integer n such that 1/n < x.

4. If x is an arbitrary real number, prove that there is exactly one integer n which satisfies the
inequalities n ≤ x < n + 1. This n is called the greatest integer in x and is denoted by [x].
For example, [5] = 5, [5/2] = 2, [−8/3] = −3.

5. If x is an arbitrary real number, prove that there is exactly one integer n which satisfies
x ≤ n < x + 1.

6. If x and y are arbitrary real numbers, x < y, prove that there exists at least one rational
number r satisfying x < r < y, and hence infinitely many. This property is often described by
saying that the rational numbers are dense in the real-number system.

7. If x is rational, x ≠ 0, and y irrational, prove that x + y, x − y, xy, x/y, and y/x are all
irrational.

8. Is the sum or product of two irrational numbers always irrational?

9. If x and y are arbitrary real numbers, x < y, prove that there exists at least one irrational
number z satisfying x < z < y, and hence infinitely many.

10. An integer n is called even if n = 2m for some integer m, and odd if n + 1 is even. Prove the 
following statements : 

(a) An integer cannot be both even and odd. 

(b) Every integer is either even or odd. 

(c) The sum or product of two even integers is even. What can you say about the sum or 
product of two odd integers? 

(d) If n^2 is even, so is n. If a^2 = 2b^2, where a and b are integers, then both a and b are even.

(e) Every rational number can be expressed in the form a/b, where a and b are integers, at
least one of which is odd.

11. Prove that there is no rational number whose square is 2.

[Hint: Argue by contradiction. Assume (a/b)^2 = 2, where a and b are integers, at least
one of which is odd. Use parts of Exercise 10 to deduce a contradiction.]

12. The Archimedean property of the real-number system was deduced as a consequence of the
least-upper-bound axiom. Prove that the set of rational numbers satisfies the Archimedean
property but not the least-upper-bound property. This shows that the Archimedean
property does not imply the least-upper-bound axiom.


*I 3.13 Existence of square roots of nonnegative real numbers

It was pointed out earlier that the equation x^2 = 2 has no solutions among the rational
numbers. With the help of Axiom 10, we can prove that the equation x^2 = a has a solution
among the real numbers if a ≥ 0. Each such x is called a square root of a.

First, let us see what we can say about square roots without using Axiom 10. Negative
numbers cannot have square roots because if x^2 = a, then a, being a square, must be
nonnegative (by Theorem 1.20). Moreover, if a = 0, then x = 0 is the only square root
(by Theorem 1.11). Suppose, then, that a > 0. If x^2 = a, then x ≠ 0 and (−x)^2 = a,
so both x and its negative are square roots. In other words, if a has a square root, then it
has two square roots, one positive and one negative. Also, it has at most two because
if x^2 = a and y^2 = a, then x^2 = y^2 and (x − y)(x + y) = 0, and so, by Theorem 1.11,
either x = y or x = −y. Thus, if a has a square root, it has exactly two.

The existence of at least one square root can be deduced from an important theorem 
in calculus known as the intermediate-value theorem for continuous functions, but it 
may be instructive to see how the existence of a square root can be proved directly from 
Axiom 10. 


THEOREM 1.35. Every nonnegative real number a has a unique nonnegative square root.

Note: If a ≥ 0, we denote its nonnegative square root by a^(1/2) or by √a. If a > 0,
the negative square root is −a^(1/2) or −√a.

Proof. If a = 0, then 0 is the only square root. Assume, then, that a > 0. Let S be
the set of all positive x such that x^2 ≤ a. Since (1 + a)^2 > a, the number 1 + a is an
upper bound for S. Also, S is nonempty because the number a/(1 + a) is in S; in fact,
a^2 ≤ a(1 + a)^2 and hence a^2/(1 + a)^2 ≤ a. By Axiom 10, S has a least upper bound
which we shall call b. Note that b ≥ a/(1 + a) so b > 0. There are only three possibilities:
b^2 > a, b^2 < a, or b^2 = a.

Suppose b^2 > a and let c = b − (b^2 − a)/(2b) = (1/2)(b + a/b). Then 0 < c < b and
c^2 = b^2 − (b^2 − a) + (b^2 − a)^2/(4b^2) = a + (b^2 − a)^2/(4b^2) > a. Therefore c^2 > x^2
for each x in S, and hence c > x for each x in S. This means that c is an upper bound for
S. Since c < b, we have a contradiction because b was the least upper bound for S.
Therefore the inequality b^2 > a is impossible.

Suppose b^2 < a. Since b > 0, we may choose a positive number c such that c < b and
such that c < (a − b^2)/(3b). Then we have

(b + c)^2 = b^2 + c(2b + c) < b^2 + 3bc < b^2 + (a − b^2) = a.

Therefore b + c is in S. Since b + c > b, this contradicts the fact that b is an upper
bound for S. Therefore the inequality b^2 < a is impossible, and the only remaining
alternative is b^2 = a.
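
The number c = (1/2)(b + a/b) used in the first case of the proof can be computed explicitly. The sketch below, with a = 2 and starting value 1 + a as assumptions chosen for illustration, applies that step repeatedly: each result is a smaller positive number whose square still exceeds a, exactly as the proof asserts for a single step. Repeating the step is an addition made here for illustration; the proof itself uses it only once, to reach a contradiction. The function name improve is hypothetical.

```python
from fractions import Fraction

def improve(b, a):
    """One step c = (b + a/b)/2 from the proof; assumes b > 0 and b*b > a."""
    return (b + a / b) / 2

a = Fraction(2)
b = 1 + a                  # an upper bound for S, with b*b > a
steps = [b]
for _ in range(5):
    b = improve(b, a)
    steps.append(b)

# Each new value is smaller than the previous one, and its square still exceeds a.
decreasing = all(q < p for p, q in zip(steps, steps[1:]))
squares_exceed = all(s * s > a for s in steps)
print([float(s) for s in steps], decreasing, squares_exceed)
```
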

*I 3.14 Roots of higher order. Rational powers

The least-upper-bound axiom can also be used to show the existence of roots of higher
order. For example, if n is a positive odd integer, then for each real x there is exactly
one real y such that y^n = x. This y is called the nth root of x and is denoted by

(1.15) y = x^(1/n) or y = ⁿ√x.

When n is even, the situation is slightly different. In this case, if x is negative, there is no
real y such that y^n = x because y^n ≥ 0 for all real y. However, if x is positive, it can be
shown that there is one and only one positive y such that y^n = x. This y is called the positive
nth root of x and is denoted by the symbols in (1.15). Since n is even, (−y)^n = y^n and hence
each x > 0 has two real nth roots, y and −y. However, the symbols x^(1/n) and ⁿ√x are
reserved for the positive nth root. We do not discuss the proofs of these statements here
because they will be deduced later as consequences of the intermediate-value theorem for
continuous functions (see Section 3.10).

If r is a positive rational number, say r = m/n, where m and n are positive integers, we
define x^r to be (x^m)^(1/n), the nth root of x^m, whenever this exists. If x ≠ 0, we define x^(−r) =
1/x^r whenever x^r is defined. From these definitions, it is easy to verify that the usual laws
of exponents are valid for rational exponents: x^r · x^s = x^(r+s), (x^r)^s = x^(rs), and (xy)^r = x^r y^r.


*I 3.15 Representation of real numbers by decimals

A real number of the form

(1.16) r = a_0 + a_1/10 + a_2/10^2 + ... + a_n/10^n,

where a_0 is a nonnegative integer and a_1, a_2, ..., a_n are integers satisfying 0 ≤ a_i ≤ 9, is
usually written more briefly as follows:

r = a_0.a_1a_2 ... a_n.

This is said to be a finite decimal representation of r. For example,

1/2 = 5/10 = 0.5,   1/50 = 2/10^2 = 0.02,   29/4 = 7 + 2/10 + 5/10^2 = 7.25.

Real numbers like these are necessarily rational and, in fact, they all have the form r = a/10^n,
where a is an integer. However, not all rational numbers can be expressed with finite
decimal representations. For example, if 1/3 could be so expressed, then we would have
1/3 = a/10^n or 3a = 10^n for some integer a. But this is impossible since 3 is not a factor of any
power of 10.

Nevertheless, we can approximate an arbitrary real number x > 0 to any desired degree
of accuracy by a sum of the form (1.16) if we take n large enough. The reason for this may
be seen by the following geometric argument: If x is not an integer, then x lies between two
consecutive integers, say a_0 < x < a_0 + 1. The segment joining a_0 and a_0 + 1 may be
subdivided into ten equal parts. If x is not one of the subdivision points, then x must lie
between two consecutive subdivision points. This gives us a pair of inequalities of the form

a_0 + a_1/10 < x < a_0 + (a_1 + 1)/10,

where a_1 is an integer (0 ≤ a_1 ≤ 9). Next we divide the segment joining a_0 + a_1/10 and
a_0 + (a_1 + 1)/10 into ten equal parts (each of length 10^(−2)) and continue the process. If
after a finite number of steps a subdivision point coincides with x, then x is a number of the
form (1.16). Otherwise the process continues indefinitely, and it generates an infinite set of
integers a_1, a_2, a_3, .... In this case, we say that x has the infinite decimal representation

x = a_0.a_1a_2a_3 ....

At the nth stage, x satisfies the inequalities

a_0 + a_1/10 + ... + a_n/10^n < x < a_0 + a_1/10 + ... + a_{n−1}/10^(n−1) + (a_n + 1)/10^n.

This gives us two approximations to x, one from above and one from below, by finite
decimals that differ by 10^(−n). Therefore we can achieve any desired degree of accuracy in
our approximations by taking n large enough.

When x = 1/3, it is easy to verify that a_0 = 0 and a_n = 3 for all n ≥ 1, and hence the
corresponding infinite decimal expansion is

1/3 = 0.333 ....

Every irrational number has an infinite decimal representation. For example, when x = √2
we may calculate by trial and error as many digits in the expansion as we wish. Thus, √2
lies between 1.4 and 1.5, because (1.4)^2 < 2 < (1.5)^2. Similarly, by squaring and
comparing with 2, we find the following further approximations:

1.41 < √2 < 1.42,   1.414 < √2 < 1.415,   1.4142 < √2 < 1.4143.


Note that the foregoing process generates a succession of intervals of lengths 10^(−1), 10^(−2),
10^(−3), ..., each contained in the preceding and each containing the point x. This is an
example of what is known as a sequence of nested intervals, a concept that is sometimes used
as a basis for constructing the irrational numbers from the rational numbers.
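
The nested intervals for √2 can be reproduced with integer arithmetic alone. In the sketch below the trial-and-error search of the text is replaced by Python's math.isqrt, which returns the largest integer m with m^2 ≤ N; that substitution and the function name sqrt2_bounds are assumptions made for this illustration.

```python
import math

def sqrt2_bounds(n):
    """Return integers (m, m + 1) with m/10^n < sqrt(2) < (m + 1)/10^n."""
    m = math.isqrt(2 * 10 ** (2 * n))   # largest integer m with m*m <= 2*10^(2n)
    return m, m + 1

for n in range(1, 5):
    lo, hi = sqrt2_bounds(n)
    # Comparing squares with 2, as in the text: (lo/10^n)^2 < 2 < (hi/10^n)^2.
    assert lo * lo < 2 * 10 ** (2 * n) < hi * hi
    print(f"{lo / 10**n} < sqrt(2) < {hi / 10**n}")
```
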

Since we shall do very little with decimals in this book, we shall not develop their
properties in any further detail except to mention how decimal expansions may be defined
analytically with the help of the least-upper-bound axiom.

If x is a given positive real number, let a_0 denote the largest integer ≤ x. Having chosen
a_0, we let a_1 denote the largest integer such that

a_0 + a_1/10 ≤ x.

More generally, having chosen a_0, a_1, ..., a_{n−1}, we let a_n denote the largest integer such
that

(1.17) a_0 + a_1/10 + a_2/10^2 + ... + a_n/10^n ≤ x.

Let S denote the set of all numbers

(1.18) a_0 + a_1/10 + a_2/10^2 + ... + a_n/10^n

obtained in this way for n = 0, 1, 2, .... Then S is nonempty and bounded above, and
it is easy to verify that x is actually the least upper bound of S. The integers a_0, a_1, a_2, ...
so obtained may be used to define a decimal expansion of x if we write

x = a_0.a_1a_2a_3 ...

to mean that the nth digit a_n is the largest integer satisfying (1.17). For example, if x = 1/8,
we find a_0 = 0, a_1 = 1, a_2 = 2, a_3 = 5, and a_n = 0 for all n ≥ 4. Therefore we may write

1/8 = 0.125000 ....


If in (1.17) we replace the inequality sign ≤ by <, we obtain a slightly different definition
of decimal expansions. The least upper bound of all numbers of the form (1.18) is again x,
although the integers a_0, a_1, a_2, ... need not be the same as those which satisfy (1.17). For
example, if this second definition is applied to x = 1/8, we find a_0 = 0, a_1 = 1, a_2 = 2,
a_3 = 4, and a_n = 9 for all n ≥ 4. This leads to the infinite decimal representation

1/8 = 0.124999 ....

The fact that a real number might have two different decimal representations is merely a 
reflection of the fact that two different sets of real numbers can have the same supremum. 
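
Both definitions of the digits can be carried out mechanically for a rational x. The following Python sketch chooses each digit as the largest integer satisfying (1.17), or its strict variant when strict=True; the function name digits and its parameters are assumptions made for this illustration.

```python
from fractions import Fraction
import math

def digits(x, n, strict=False):
    """Digits a_0, a_1, ..., a_n of a positive rational x.

    With strict=False, a_k is the largest integer whose partial sum is <= x,
    as in (1.17). With strict=True the sign <= is replaced by <.
    """
    x = Fraction(x)
    result = []
    partial = Fraction(0)
    for k in range(n + 1):
        scale = 10 ** k
        # Largest integer a with partial + a/10^k <= x (or < x when strict).
        q = (x - partial) * scale
        a = math.floor(q)
        if strict and a == q:
            a -= 1
        result.append(a)
        partial += Fraction(a, scale)
    return result

print(digits(Fraction(1, 8), 6))               # 1/8 using the sign <=
print(digits(Fraction(1, 8), 6, strict=True))  # 1/8 using the sign <
```

For x = 1/8 the two calls reproduce the expansions 0.125000 ... and 0.124999 ... discussed above.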


Part 4. Mathematical Induction, Summation Notation, and Related Topics


I 4.1 An example of a proof by mathematical induction

There is no largest integer because when we add 1 to an integer k, we obtain k + 1, 
which is larger than k. Nevertheless, starting with the number 1, we can reach any positive 
integer whatever in a finite number of steps, passing successively from k to k + 1 at each 
step. This is the basis for a type of reasoning that mathematicians call proof by induction. 
We shall illustrate the use of this method by proving the pair of inequalities used in Section
I 1.3 in the computation of the area of a parabolic segment, namely

(1.19) 1^2 + 2^2 + ... + (n − 1)^2 < n^3/3 < 1^2 + 2^2 + ... + n^2.

Consider the leftmost inequality first, and let us refer to this formula as A(n) (an assertion
involving n). It is easy to verify this assertion directly for the first few values of n. Thus,
for example, when n takes the values 1, 2, and 3, the assertion becomes

A(1): 0 < 1/3,   A(2): 1^2 < 8/3,   A(3): 1^2 + 2^2 < 27/3,

provided we agree to interpret the sum on the left as 0 when n = 1.

Our object is to prove that A(n) is true for every positive integer n. The procedure is as
follows: Assume the assertion has been proved for a particular value of n, say for n = k.
That is, assume we have proved

A(k): 1^2 + 2^2 + ... + (k − 1)^2 < k^3/3

for a fixed k ≥ 1. Now using this, we shall deduce the corresponding result for k + 1:

A(k + 1): 1^2 + 2^2 + ... + k^2 < (k + 1)^3/3.

Start with A(k) and add k^2 to both sides. This gives the inequality

1^2 + 2^2 + ... + k^2 < k^3/3 + k^2.

To obtain A(k + 1) as a consequence of this, it suffices to show that

k^3/3 + k^2 < (k + 1)^3/3.

But this follows at once from the equation

(k + 1)^3/3 = (k^3 + 3k^2 + 3k + 1)/3 = k^3/3 + k^2 + k + 1/3.

Therefore we have shown that A(k + 1) follows from A(k). Now, since A(1) has been
verified directly, we conclude that A(2) is also true. Knowing that A(2) is true, we conclude
that A(3) is true, and so on. Since every integer can be reached in this way, A(n) is true for
all positive integers n. This proves the leftmost inequality in (1.19). The rightmost inequality
can be proved in the same way.
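
A numerical check is no substitute for the induction argument, but it is a useful sanity test of both the inequalities (1.19) and the algebraic identity used in the inductive step. The Python sketch below tests them for n and k up to 500; the function names and the cutoff are assumptions chosen for illustration.

```python
from fractions import Fraction

def check_1_19(n):
    """Check 1^2 + ... + (n-1)^2 < n^3/3 < 1^2 + ... + n^2 for one value of n."""
    left = sum(k * k for k in range(1, n))   # the empty sum is 0 when n = 1
    right = left + n * n
    return left < Fraction(n ** 3, 3) < right

def step_identity(k):
    """Check (k+1)^3/3 = k^3/3 + k^2 + k + 1/3, the equation used in the inductive step."""
    return Fraction((k + 1) ** 3, 3) == Fraction(k ** 3, 3) + k * k + k + Fraction(1, 3)

all_hold = all(check_1_19(n) for n in range(1, 501))
identity_holds = all(step_identity(k) for k in range(1, 501))
print(all_hold, identity_holds)
```
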

I 4.2 The principle of mathematical induction

The reader should make certain that he understands the pattern of the foregoing proof. 
First we proved the assertion A(n) for n = 1. Next we showed that if the assertion is true 
for a particular integer, then it is also true for the next integer. From this, we concluded 
that the assertion is true for all positive integers. 

The idea of induction may be illustrated in many nonmathematical ways. For example, 
imagine a row of toy soldiers, numbered consecutively, and suppose they are so arranged 
that if any one of them falls, say the one labeled k, it will knock over the next one, labeled 
k + 1. Then anyone can visualize what would happen if soldier number 1 were toppled 
backward. It is also clear that if a later soldier were knocked over first, say the one labeled
n_1, then all soldiers behind him would fall. This illustrates a slight generalization of the
method of induction which can be described in the following way.

Method of proof by induction. Let A(n) be an assertion involving an integer n. We
conclude that A(n) is true for every n ≥ n_1 if we can perform the following two steps:

(a) Prove that A(n_1) is true.

(b) Let k be an arbitrary but fixed integer ≥ n_1. Assume that A(k) is true and prove that
A(k + 1) is also true.

In actual practice n_1 is usually 1. The logical justification for this method of proof is the
following theorem about real numbers.

THEOREM 1.36. PRINCIPLE OF MATHEMATICAL INDUCTION. Let S be a set of positive
integers which has the following two properties:

(a) The number 1 is in the set S.

(b) If an integer k is in S, then so is k + 1.

Then every positive integer is in the set S.

Proof. Properties (a) and (b) tell us that S is an inductive set. But the positive integers
were defined to be exactly those real numbers which belong to every inductive set. (See
Section I 3.6.) Therefore S contains every positive integer.

Whenever we carry out a proof of an assertion A(n) for all n ≥ 1 by mathematical
induction, we are applying Theorem 1.36 to the set S of all the integers for which the assertion is
true. If we want to prove that A(n) is true only for n ≥ n_1, we apply Theorem 1.36 to the
set of n for which A(n + n_1 − 1) is true.

*I 4.3 The well-ordering principle

There is another important property of the positive integers, called the well-ordering 
principle, that is also used as a basis for proofs by induction. It can be stated as follows. 

THEOREM 1.37. WELL-ORDERING PRINCIPLE. Every nonempty set of positive integers
contains a smallest member.

Note that the well-ordering principle refers to sets of positive integers. The theorem is
not true for arbitrary sets of integers. For example, the set of all integers has no smallest
member.

The well-ordering principle can be deduced from the principle of induction. This is
demonstrated in Section I 4.5. We conclude this section with an example showing how the
well-ordering principle can be used to prove theorems about positive integers.

Let A(n) denote the following assertion:

A(n): 1^2 + 2^2 + ... + n^2 = n^3/3 + n^2/2 + n/6.

Again, we note that A(1) is true, since

1^2 = 1/3 + 1/2 + 1/6.

Now there are only two possibilities. We have either

(i) A(n) is true for every positive integer n, or

(ii) there is at least one positive integer n for which A(n) is false.

We shall prove that alternative (ii) leads to a contradiction. Assume (ii) holds. Then by
the well-ordering principle, there must be a smallest positive integer, say k, for which
A(k) is false. (We apply the well-ordering principle to the set of all positive integers n for
which A(n) is false. Statement (ii) says that this set is nonempty.) This k must be greater
than 1, because we have verified that A(1) is true. Also, the assertion must be true for
k − 1, since k was the smallest integer for which A(k) is false; therefore we may write

A(k − 1): 1^2 + 2^2 + ... + (k − 1)^2 = (k − 1)^3/3 + (k − 1)^2/2 + (k − 1)/6.

Adding k^2 to both sides and simplifying the right-hand side, we find

1^2 + 2^2 + ... + k^2 = k^3/3 + k^2/2 + k/6.

But this equation states that A(k) is true; therefore we have a contradiction, because k is
an integer for which A(k) is false. In other words, statement (ii) leads to a contradiction.
Therefore (i) holds, and this proves that the identity in question is valid for all values of
n ≥ 1. An immediate consequence of this identity is the rightmost inequality in (1.19).

A proof like this which makes use of the well-ordering principle is also referred to as
a proof by induction. Of course, the proof could also be put in the more usual form in
which we verify A(1) and then pass from A(k) to A(k + 1).
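
The shape of the well-ordering argument, namely looking for the smallest integer at which the assertion fails, can be imitated by a finite search. The Python sketch below looks for the smallest counterexample to the sum-of-squares identity among the first 300 positive integers; the function names and the bound 300 are assumptions chosen for illustration, and a search that finds nothing in a finite range does not replace the proof.

```python
from fractions import Fraction

def A(n):
    """Return True if 1^2 + 2^2 + ... + n^2 = n^3/3 + n^2/2 + n/6."""
    lhs = sum(k * k for k in range(1, n + 1))
    rhs = Fraction(n ** 3, 3) + Fraction(n ** 2, 2) + Fraction(n, 6)
    return lhs == rhs

def smallest_counterexample(limit):
    """Smallest n in 1..limit for which A(n) is false, or None if there is none."""
    failures = [n for n in range(1, limit + 1) if not A(n)]
    return min(failures) if failures else None

print(smallest_counterexample(300))
```
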


I 4.4 Exercises

1. Prove the following formulas by induction:

(a) 1 + 2 + 3 + ... + n = n(n + 1)/2.

(b) 1 + 3 + 5 + ... + (2n − 1) = n^2.

(c) 1^3 + 2^3 + 3^3 + ... + n^3 = (1 + 2 + 3 + ... + n)^2.

(d) 1^3 + 2^3 + ... + (n − 1)^3 < n^4/4 < 1^3 + 2^3 + ... + n^3.

2. Note that

1 = 1,

1 − 4 = −(1 + 2),

1 − 4 + 9 = 1 + 2 + 3,

1 − 4 + 9 − 16 = −(1 + 2 + 3 + 4).

Guess the general law suggested and prove it by induction.

3. Note that

1 + 1/2 = 2 − 1/2,

1 + 1/2 + 1/4 = 2 − 1/4,

1 + 1/2 + 1/4 + 1/8 = 2 − 1/8.

Guess the general law suggested and prove it by induction.

4. Note that

1 − 1/2 = 1/2,

(1 − 1/2)(1 − 1/3) = 1/3,

(1 − 1/2)(1 − 1/3)(1 − 1/4) = 1/4.

Guess the general law suggested and prove it by induction.

5. Guess a general law which simplifies the product 



and prove it by induction. 

6. Let A(n) denote the statement: 1 + 2 + ... + n = (1/8)(2n + 1)^2.

(a) Prove that if A(k) is true for an integer k, then A(k + 1) is also true.

(b) Criticize the statement: “By induction it follows that A(n) is true for all n.”

(c) Amend A(n) by changing the equality to an inequality that is true for all positive integers n.

7. Let n_1 be the smallest positive integer n for which the inequality (1 + x)^n > 1 + nx + nx^2 is
true for all x > 0. Compute n_1, and prove that the inequality is true for all integers n ≥ n_1.

8. Given positive real numbers a_1, a_2, a_3, ..., such that a_n ≤ c a_{n−1} for all n ≥ 2, where c is a
fixed positive number, use induction to prove that a_n ≤ a_1 c^(n−1) for all n ≥ 1.

9. Prove the following statement by induction: If a line of unit length is given, then a line of
length √n can be constructed with straightedge and compass for each positive integer n.

10. Let b denote a fixed positive integer. Prove the following statement by induction: For every
integer n ≥ 0, there exist nonnegative integers q and r such that

n = qb + r,   0 ≤ r < b.

11. Let n and d denote integers. We say that d is a divisor of n if n = cd for some integer c. An
integer n is called a prime if n > 1 and if the only positive divisors of n are 1 and n. Prove, by
induction, that every integer n > 1 is either a prime or a product of primes.

12. Describe the fallacy in the following “proof” by induction: 

Statement. Given any collection of n blonde girls. If at least one of the girls has blue eyes, 
then all n of them have blue eyes. 

“Proof.” The statement is obviously true when n = 1. The step from k to k + 1 can
be illustrated by going from n = 3 to n = 4. Assume, therefore, that the statement is true
when n = 3 and let $G_1$, $G_2$, $G_3$, $G_4$ be four blonde girls, at least one of which, say $G_1$, has blue
eyes. Taking $G_1$, $G_2$, and $G_3$ together and using the fact that the statement is true when n = 3,
we find that $G_2$ and $G_3$ also have blue eyes. Repeating the process with $G_2$, $G_3$, and $G_4$, we find
that $G_4$ has blue eyes. Thus all four have blue eyes. A similar argument allows us to make
the step from k to k + 1 in general.

Corollary. All blonde girls have blue eyes. 

Proof. Since there exists at least one blonde girl with blue eyes, we can apply the foregoing 
result to the collection consisting of all blonde girls. 

Note: This example is from G. Polya, who suggests that the reader may want to test the
validity of the statement by experiment.


*I 4.5 Proof of the well-ordering principle

In this section we deduce the well-ordering principle from the principle of induction. 

Let T be a nonempty collection of positive integers. We want to prove that T has a
smallest member, that is, that there is a positive integer $t_0$ in T such that $t_0 \le t$ for all t in T.

Suppose T has no smallest member. We shall show that this leads to a contradiction.
The integer 1 cannot be in T (otherwise it would be the smallest member of T). Let S
denote the collection of all positive integers n such that n < t for all t in T. Now 1 is in S
because 1 < t for all t in T. Next, let k be a positive integer in S. Then k < t for all t in T.
We shall prove that k + 1 is also in S. If this were not so, then for some $t_1$ in T we would
have $t_1 \le k + 1$. Since T has no smallest member, there is an integer $t_2$ in T such that
$t_2 < t_1$, and hence $t_2 < k + 1$. But this means that $t_2 \le k$, contradicting the fact that
k < t for all t in T. Therefore k + 1 is in S. By the induction principle, S contains all
positive integers. Since T is nonempty, there is a positive integer t in T. But this t must also
be in S (since S contains all positive integers). It follows from the definition of S that t < t,
which is a contradiction. Therefore, the assumption that T has no smallest member leads
to a contradiction. It follows that T must have a smallest member, and in turn this proves
that the well-ordering principle is a consequence of the principle of induction.


I 4.6 The summation notation

In the calculations for the area of the parabolic segment, we encountered the sum

(1.20) $1^2 + 2^2 + 3^2 + \cdots + n^2$.

Note that a typical term in this sum is of the form $k^2$, and we get all the terms by letting k
run through the values 1, 2, 3, ..., n. There is a very useful and convenient notation which
enables us to write sums like this in a more compact form. This is called the summation
notation and it makes use of the Greek letter sigma, $\Sigma$. Using summation notation, we can
write the sum in (1.20) as follows:

$$\sum_{k=1}^{n} k^2.$$

This symbol is read: “The sum of $k^2$ for k running from 1 to n.” The numbers appearing
under and above the sigma tell us the range of values taken by k. The letter k itself is
referred to as the index of summation. Of course, it is not important that we use the letter
k; any other convenient letter may take its place. For example, instead of $\sum_{k=1}^{n} k^2$ we could
write $\sum_{i=1}^{n} i^2$, $\sum_{j=1}^{n} j^2$, $\sum_{m=1}^{n} m^2$, etc., all of which are considered as alternative notations for
the same thing. The letters i, j, k, m, etc. that are used in this way are called dummy indices.
It would not be a good idea to use the letter n for the dummy index in this particular example
because n is already being used for the number of terms.

More generally, when we want to form the sum of several real numbers, say $a_1, a_2, \ldots, a_n$,
we denote such a sum by the symbol

(1.21) $a_1 + a_2 + \cdots + a_n$

which, using summation notation, can be written as follows:

(1.22) $\sum_{k=1}^{n} a_k$.

For example, we have

$$\sum_{k=1}^{4} a_k = a_1 + a_2 + a_3 + a_4,$$

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5.$$

Sometimes it is convenient to begin summations from 0 or from some value of the index
beyond 1. For example, we have

$$\sum_{i=0}^{4} x_i = x_0 + x_1 + x_2 + x_3 + x_4,$$

$$\sum_{n=2}^{5} n^3 = 2^3 + 3^3 + 4^3 + 5^3.$$

Other uses of the summation notation are illustrated below:

$$\sum_{m=0}^{4} x^{m+1} = x + x^2 + x^3 + x^4 + x^5,$$

$$\sum_{j=1}^{6} 2^{j-1} = 1 + 2 + 2^2 + 2^3 + 2^4 + 2^5.$$

To emphasize once more that the choice of dummy index is unimportant, we note that the
last sum may also be written in each of the following forms:

$$\sum_{q=1}^{6} 2^{q-1} = \sum_{r=0}^{5} 2^{r} = \sum_{n=0}^{5} 2^{5-n} = \sum_{k=1}^{6} 2^{6-k}.$$
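The claim that the dummy index is immaterial, and that shifting or reversing the index leaves the value unchanged, can be checked numerically. The following Python sketch is only an illustration; the variable names are ours and do not appear in the text.

```python
# Four ways of writing 1 + 2 + 2^2 + 2^3 + 2^4 + 2^5.
# Only the dummy index and its range change, never the value of the sum.
s_q = sum(2 ** (q - 1) for q in range(1, 7))   # q = 1, ..., 6
s_r = sum(2 ** r for r in range(0, 6))         # r = 0, ..., 5
s_n = sum(2 ** (5 - n) for n in range(0, 6))   # n = 0, ..., 5
s_k = sum(2 ** (6 - k) for k in range(1, 7))   # k = 1, ..., 6

print(s_q, s_r, s_n, s_k)
```

Each of the four expressions lists the same six powers of 2, in either increasing or decreasing order.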

Note: From a strictly logical standpoint, the symbols in (1.21) and (1.22) do not appear
among the primitive symbols for the real-number system. In a more careful treatment, we
could define these new symbols in terms of the primitive undefined symbols of our system.
This may be done by a process known as definition by induction which, like proof by induction,
consists of two parts:

(a) We define

$$\sum_{k=1}^{1} a_k = a_1.$$

(b) Assuming that we have defined $\sum_{k=1}^{n} a_k$ for a fixed $n \ge 1$, we further define

$$\sum_{k=1}^{n+1} a_k = \left(\sum_{k=1}^{n} a_k\right) + a_{n+1}.$$


To illustrate, we may take n = 1 in (b) and use (a) to obtain

$$\sum_{k=1}^{2} a_k = \sum_{k=1}^{1} a_k + a_2 = a_1 + a_2.$$

Now, having defined $\sum_{k=1}^{2} a_k$, we can use (b) again with n = 2 to obtain

$$\sum_{k=1}^{3} a_k = \sum_{k=1}^{2} a_k + a_3 = (a_1 + a_2) + a_3.$$

By the associative law for addition (Axiom 2), the sum $(a_1 + a_2) + a_3$ is the same as
$a_1 + (a_2 + a_3)$, and therefore there is no danger of confusion if we drop the parentheses
and simply write $a_1 + a_2 + a_3$ for $\sum_{k=1}^{3} a_k$. Similarly, we have

$$\sum_{k=1}^{4} a_k = \sum_{k=1}^{3} a_k + a_4 = (a_1 + a_2 + a_3) + a_4.$$

In this case we can prove that the sum $(a_1 + a_2 + a_3) + a_4$ is the same as $(a_1 + a_2) + (a_3 + a_4)$
or $a_1 + (a_2 + a_3 + a_4)$, and therefore the parentheses can be dropped again without
danger of ambiguity, and we agree to write

$$\sum_{k=1}^{4} a_k = a_1 + a_2 + a_3 + a_4.$$

Continuing in this way, we find that (a) and (b) together give us a complete definition of
the symbol in (1.22). The notation in (1.21) is considered to be merely an alternative way of
writing (1.22). It is justified by a general associative law for addition which we shall not
attempt to state or to prove here.

The reader should notice that definition by induction and proof by induction involve the
same underlying idea. A definition by induction is also called a recursive definition.
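A recursive definition translates almost word for word into a recursive function. The sketch below mirrors parts (a) and (b); the function name `recursive_sum` and the choice of a dictionary keyed by the index k are assumptions made for this illustration only.

```python
def recursive_sum(a, n):
    """Return a[1] + ... + a[n] using the two-part definition by induction.

    `a` maps each index k (starting at 1) to the term a_k, and n >= 1.
    """
    if n == 1:
        return a[1]                          # part (a): the sum with one term
    return recursive_sum(a, n - 1) + a[n]    # part (b): previous sum plus a_n


terms = {1: 3, 2: -1, 3: 4, 4: 10}
print(recursive_sum(terms, 4))   # evaluated as ((3 + (-1)) + 4) + 10
```

The call unwinds exactly as in the illustration for n = 1, 2, 3: each sum is built from the preceding one by adding a single term.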


I 4.7 Exercises


1. Find the numerical values of the following sums:

(a) $\sum_{k=1}^{4} k$, (c) $\sum_{r=0}^{3} 2^{2r+1}$, (e) $\sum_{i=0}^{5} (2i + 1)$,


n = 2 
2. Establish the following properties of the summation notation:

(a) $\sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k$ (additive property).

(b) $\sum_{k=1}^{n} (c a_k) = c \sum_{k=1}^{n} a_k$ (homogeneous property).

(c) $\sum_{k=1}^{n} (a_k - a_{k-1}) = a_n - a_0$ (telescoping property).

Use the properties in Exercise 2 whenever possible to derive the formulas in Exercises 3 
through 8. 


3. $\sum_{k=1}^{n} 1 = n$. (This means $\sum_{k=1}^{n} a_k$, where each $a_k = 1$.)

4. $\sum_{k=1}^{n} (2k - 1) = n^2$. [Hint: $2k - 1 = k^2 - (k - 1)^2$.]

5. $\sum_{k=1}^{n} k = \frac{n^2}{2} + \frac{n}{2}$. [Hint: Use Exercises 3 and 4.]

6. $\sum_{k=1}^{n} k^2 = \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6}$. [Hint: $k^3 - (k - 1)^3 = 3k^2 - 3k + 1$.]

8. (a) $\sum_{k=0}^{n} x^k = \frac{1 - x^{n+1}}{1 - x}$ if $x \ne 1$. Note: $x^0$ is defined to be 1.

[Hint: Apply Exercise 2 to $(1 - x) \sum_{k=0}^{n} x^k$.]

(b) What is the sum equal to when x = 1?


9. Prove, by induction, that the sum $\sum_{k=1}^{2n} (-1)^k (2k + 1)$ is proportional to n, and find the
constant of proportionality.

10. (a) Give a reasonable definition of the symbol $\sum_{k=m}^{m+n} a_k$.

(b) Prove, by induction, that for $n \ge 1$ we have

$$\sum_{k=n+1}^{2n} \frac{1}{k} = \sum_{m=1}^{2n} \frac{(-1)^{m+1}}{m}.$$


11. Determine whether each of the following statements is true or false. In each case give a 
reason for your decision. 


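(a) $\sum_{n=0}^{100} n^4 = \sum_{n=1}^{100} n^4$.

(b) $\sum_{j=0}^{100} 2 = 200$.

(c) $\sum_{k=0}^{100} (2 + k) = 2 + \sum_{k=0}^{100} k$.

(d) $\sum_{i=1}^{100} (i + 1)^2 = \sum_{i=0}^{99} i^2$.

(e) $\sum_{k=1}^{100} k^3 = \left(\sum_{k=1}^{100} k\right) \cdot \left(\sum_{k=1}^{100} k^2\right)$.

(f) $\sum_{k=0}^{100} k^3 = \left(\sum_{k=0}^{100} k\right)^3$.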
12. Guess and prove a general rule which simplifies the sum

$$\sum_{k=1}^{n} \frac{1}{k(k + 1)}.$$

13. Prove that $2(\sqrt{n + 1} - \sqrt{n}) < \frac{1}{\sqrt{n}} < 2(\sqrt{n} - \sqrt{n - 1})$ if $n \ge 1$. Then use this to prove
that

$$2\sqrt{m} - 2 < \sum_{n=1}^{m} \frac{1}{\sqrt{n}} < 2\sqrt{m} - 1$$

if $m \ge 2$. In particular, when $m = 10^6$, the sum lies between 1998 and 1999.


I 4.8 Absolute values and the triangle inequality

Calculations with inequalities arise quite frequently in calculus. They are of particular
importance in dealing with the notion of absolute value. If x is a real number, the absolute
value of x is a nonnegative real number denoted by |x| and defined as follows:

$$|x| = \begin{cases} x & \text{if } x \ge 0, \\ -x & \text{if } x \le 0. \end{cases}$$

Note that $-|x| \le x \le |x|$. When real numbers are represented geometrically on a real axis,
the number |x| is called the distance of x from 0. If a > 0 and if a point x lies between -a
and a, then |x| is nearer to 0 than a is. The analytic statement of this fact is given by the
following theorem.


THEOREM 1.38. If $a \ge 0$, then $|x| \le a$ if and only if $-a \le x \le a$.


Proof. There are two statements to prove: first, that the inequality $|x| \le a$ implies the
two inequalities $-a \le x \le a$ and, conversely, that $-a \le x \le a$ implies $|x| \le a$.

Suppose $|x| \le a$. Then we also have $-a \le -|x|$. But either x = |x| or x = -|x| and
hence $-a \le -|x| \le x \le |x| \le a$. This proves the first statement.

To prove the converse, assume $-a \le x \le a$. Then if $x \ge 0$, we have $|x| = x \le a$,
whereas if $x \le 0$, we have $|x| = -x \le a$. In either case we have $|x| \le a$, and this
completes the proof.

Figure 1.9 illustrates the geometrical significance of this theorem. 


Figure 1.9 Geometrical significance of Theorem 1.38: $|x| \le a$ on the interval from -a to a.


As a consequence of Theorem 1.38, it is easy to derive an important inequality which 
states that the absolute value of a sum of two real numbers cannot exceed the sum of their 
absolute values. 
THEOREM 1.39. For arbitrary real numbers x and y, we have

$$|x + y| \le |x| + |y|.$$

Note: This property is called the triangle inequality, because when it is generalized to
vectors it states that the length of any side of a triangle is less than or equal to the sum of
the lengths of the other two sides.

Proof. Adding the inequalities $-|x| \le x \le |x|$ and $-|y| \le y \le |y|$, we obtain

$$-(|x| + |y|) \le x + y \le |x| + |y|,$$

and hence, by Theorem 1.38, we conclude that $|x + y| \le |x| + |y|$.

If we take x = a - c and y = c - b, then x + y = a - b and the triangle inequality
becomes

$$|a - b| \le |a - c| + |b - c|.$$

This form of the triangle inequality is often used in practice.

Using mathematical induction, we may extend the triangle inequality as follows: 

THEOREM 1.40. For arbitrary real numbers $a_1, a_2, \ldots, a_n$, we have

$$\left|\sum_{k=1}^{n} a_k\right| \le \sum_{k=1}^{n} |a_k|.$$

Proof. When n = 1 the inequality is trivial, and when n = 2 it is the triangle inequality.
Assume, then, that it is true for n real numbers. Then for n + 1 real numbers $a_1, a_2, \ldots, a_{n+1}$,
we have

$$\left|\sum_{k=1}^{n+1} a_k\right| = \left|\sum_{k=1}^{n} a_k + a_{n+1}\right| \le \left|\sum_{k=1}^{n} a_k\right| + |a_{n+1}| \le \sum_{k=1}^{n} |a_k| + |a_{n+1}| = \sum_{k=1}^{n+1} |a_k|.$$

Hence the theorem is true for n + 1 numbers if it is true for n. By induction, it is true for
every positive integer n.
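A numerical spot check is no substitute for the proof, but it can make the statement of Theorem 1.40 concrete. The sketch below tests the inequality on randomly chosen integer lists; the helper names `abs_of_sum` and `sum_of_abs` are ours, and integers are used so that no rounding error enters the comparison.

```python
import random


def abs_of_sum(values):
    """Left side of Theorem 1.40: |a_1 + ... + a_n|."""
    return abs(sum(values))


def sum_of_abs(values):
    """Right side of Theorem 1.40: |a_1| + ... + |a_n|."""
    return sum(abs(v) for v in values)


random.seed(0)
for _ in range(1000):
    n = random.randint(1, 10)
    a = [random.randint(-50, 50) for _ in range(n)]
    assert abs_of_sum(a) <= sum_of_abs(a)

# Mixed signs give strict inequality; equal signs give equality.
print(abs_of_sum([3, -4]), sum_of_abs([3, -4]))
print(abs_of_sum([3, 4]), sum_of_abs([3, 4]))
```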

The next theorem describes an important inequality that we shall use later in connection 
with our study of vector algebra. 

THEOREM 1.41. THE CAUCHY-SCHWARZ INEQUALITY. If $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ are
arbitrary real numbers, we have

(1.23) $\left(\sum_{k=1}^{n} a_k b_k\right)^2 \le \left(\sum_{k=1}^{n} a_k^2\right)\left(\sum_{k=1}^{n} b_k^2\right)$.

The equality sign holds if and only if there is a real number x such that $a_k x + b_k = 0$ for each
k = 1, 2, ..., n.

Proof. We have $\sum_{k=1}^{n} (a_k x + b_k)^2 \ge 0$ for every real x because a sum of squares can
never be negative. This may be written in the form

(1.24) $Ax^2 + 2Bx + C \ge 0$,

where

$$A = \sum_{k=1}^{n} a_k^2, \qquad B = \sum_{k=1}^{n} a_k b_k, \qquad C = \sum_{k=1}^{n} b_k^2.$$

We wish to prove that $B^2 \le AC$. If A = 0, then each $a_k = 0$, so B = 0 and the result is
trivial. If $A \ne 0$, we may complete the square and write

$$Ax^2 + 2Bx + C = A\left(x + \frac{B}{A}\right)^2 + \frac{AC - B^2}{A}.$$

The right side has its smallest value when x = -B/A. Putting x = -B/A in (1.24), we
obtain $B^2 \le AC$. This proves (1.23). The reader should verify that the equality sign holds
if and only if there is an x such that $a_k x + b_k = 0$ for each k.
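The quantities A, B, and C in the proof are easy to compute for specific numbers, which gives a quick sanity check of the completed square. The following sketch is an illustration under our own choice of sample data; the function name `quadratic_coefficients` is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def quadratic_coefficients(a, b):
    """Return (A, B, C) with A = sum a_k^2, B = sum a_k b_k, C = sum b_k^2."""
    A = sum(x * x for x in a)
    B = sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A, B, C


a = [1, 2, 3]
b = [4, -5, 6]
A, B, C = quadratic_coefficients(a, b)

# Value of A x^2 + 2 B x + C at the minimizing point x = -B/A.
x_min = Fraction(-B, A)
q_min = A * x_min ** 2 + 2 * B * x_min + C

print(A, B, C, q_min)

# Equality case: b_k = -2 a_k, so a_k x + b_k = 0 for x = 2.
A2, B2, C2 = quadratic_coefficients([1, 2, 3], [-2, -4, -6])
print(B2 * B2 == A2 * C2)
```

The minimum value equals $(AC - B^2)/A$, so its nonnegativity is exactly the inequality $B^2 \le AC$.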

I 4.9 Exercises

1. Prove each of the following properties of absolute values.

(a) $|x| = 0$ if and only if x = 0. (f) $|xy| = |x|\,|y|$.

(b) $|-x| = |x|$. (g) $|x/y| = |x|/|y|$ if $y \ne 0$.

(c) $|x - y| = |y - x|$. (h) $|x - y| \le |x| + |y|$.

(d) $|x|^2 = x^2$. (i) $|x| - |y| \le |x - y|$.

(e) $|x| = \sqrt{x^2}$. (j) $\bigl|\,|x| - |y|\,\bigr| \le |x - y|$.

2. Each inequality ($a_i$), listed below, is equivalent to exactly one inequality ($b_j$). For example,
|x| < 3 if and only if -3 < x < 3, and hence ($a_1$) is equivalent to ($b_2$). Determine all equivalent
pairs.

($a_1$) $|x| < 3$. ($b_1$) $4 < x < 6$.

($a_2$) $|x - 1| < 3$. ($b_2$) $-3 < x < 3$.

($a_3$) $|3 - 2x| < 1$. ($b_3$) $x > 3$ or $x < -1$.

($a_4$) $|1 + 2x| \le 1$. ($b_4$) $x > 2$.

($a_5$) $|x - 1| > 2$. ($b_5$) $-2 < x < 4$.

($a_6$) $|x + 2| \ge 5$. ($b_6$) $-\sqrt{3} \le x \le -1$ or $1 \le x \le \sqrt{3}$.

($a_7$) $|5 - x^{-1}| < 1$. ($b_7$) $1 < x < 2$.

($a_8$) $|x - 5| < |x + 1|$. ($b_8$) $x \le -7$ or $x \ge 3$.

($a_9$) $|x^2 - 2| \le 1$. ($b_9$) $\frac{1}{6} < x < \frac{1}{4}$.

($a_{10}$) $x < x^2 - 12 < 4x$. ($b_{10}$) $-1 \le x \le 0$.

3. Determine whether each of the following is true or false. In each case give a reason for your 
decision. 

(a) x < 5 implies |x| < 5.

(b) |x - 5| < 2 implies 3 < x < 7.

(c) $|1 + 3x| \le 1$ implies $x \ge -\frac{2}{3}$.

(d) There is no real x for which |x - 1| = |x - 2|.

(e) For every x > 0 there is a y > 0 such that |2x + y| = 5.

4. Show that the equality sign holds in the Cauchy-Schwarz inequality if and only if there is a real 
number x such that a k x + b k = 0 for every k = 1, 2, . . . , n. 
*I 4.10 Miscellaneous exercises involving induction

In this section we assemble a number of miscellaneous facts whose proofs are good exercises in 
the use of mathematical induction. Some of these exercises may serve as a basis for supplementary 
classroom discussion. 

Factorials and binomial coefficients. The symbol n! (read “n factorial”) may be defined by
induction as follows: 0! = 1, n! = (n - 1)! n if $n \ge 1$. Note that $n! = 1 \cdot 2 \cdot 3 \cdots n$.

If $0 \le k \le n$, the binomial coefficient $\binom{n}{k}$ is defined as follows:

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}.$$

Note: Sometimes ${}_nC_k$ is written for $\binom{n}{k}$. These numbers appear as coefficients
in the binomial theorem. (See Exercise 4 below.)
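Both definitions can be transcribed directly into code, which is convenient for experimenting with the exercises that follow. This is a minimal sketch; the function names `factorial` and `binomial` are our own choices, and the recursion follows the inductive definition above rather than any optimized method.

```python
def factorial(n):
    """n! defined by induction: 0! = 1 and n! = (n - 1)! * n for n >= 1."""
    return 1 if n == 0 else factorial(n - 1) * n


def binomial(n, k):
    """Binomial coefficient n! / (k! (n - k)!) for 0 <= k <= n."""
    return factorial(n) // (factorial(k) * factorial(n - k))


# The coefficients for n = 6.
print([binomial(6, k) for k in range(7)])
```

The integer division `//` is exact here because k!(n - k)! always divides n!.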

1. Compute the values of the following binomial coefficients : 

(a) (5), (b)(5), (c)(1), (d)(1), (e) (II), (f)(8). 

2. (a) Show that (£) = („ ” k ). (c) Find k, given that (\f) = { k ffi 4 ). 

(b) Find n, given that ( ” 0 ) = (”). (d) Is there a k such that (ffi) = ( 3 )? 

3. Prove that $\binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}$. This is called the law of Pascal's triangle and it provides a
rapid way of computing binomial coefficients successively. Pascal's triangle is illustrated here
for $n \le 6$.

            1
          1   1
        1   2   1
      1   3   3   1
    1   4   6   4   1
  1   5  10  10   5   1
1   6  15  20  15   6   1

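4. Use induction to prove the binomial theorem

$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}.$$

Then use the theorem to derive the formulas

$$\sum_{k=0}^{n} \binom{n}{k} = 2^n \qquad \text{and} \qquad \sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0, \quad \text{if } n > 0.$$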


The product notation. The product of n real numbers $a_1, a_2, \ldots, a_n$ is denoted by the symbol
$\prod_{k=1}^{n} a_k$, which may be defined by induction. The symbol $a_1 a_2 \cdots a_n$ is an alternative notation for
this product. Note that

$$n! = \prod_{k=1}^{n} k.$$

5. Give a definition by induction for the product $\prod_{k=1}^{n} a_k$.
Prove the following properties of products by induction:

6. $\prod_{k=1}^{n} (a_k b_k) = \left(\prod_{k=1}^{n} a_k\right)\left(\prod_{k=1}^{n} b_k\right)$ (multiplicative property).

An important special case is the relation $\prod_{k=1}^{n} (c a_k) = c^n \prod_{k=1}^{n} a_k$.

7. $\prod_{k=1}^{n} \frac{a_k}{a_{k-1}} = \frac{a_n}{a_0}$ if each $a_k \ne 0$ (telescoping property).

8. If $x \ne 1$, show that

$$\prod_{k=1}^{n} \left(1 + x^{2^{k-1}}\right) = \frac{1 - x^{2^n}}{1 - x}.$$

What is the value of the product when x = 1?

9. If $a_k < b_k$ for each k = 1, 2, ..., n, it is easy to prove by induction that $\sum_{k=1}^{n} a_k < \sum_{k=1}^{n} b_k$.
Discuss the corresponding inequality for products:

$$\prod_{k=1}^{n} a_k < \prod_{k=1}^{n} b_k.$$

Some special inequalities

Some special inequalities 

10. If x > 1, prove by induction that $x^n > x$ for every integer $n \ge 2$. If 0 < x < 1, prove that
$x^n < x$ for every integer $n \ge 2$.

11. Determine all positive integers n for which $2^n < n!$.

12. (a) Use the binomial theorem to prove that for n a positive integer we have 



(b) If n > 1, use part (a) and Exercise 11 to deduce the inequalities

$$2 < \left(1 + \frac{1}{n}\right)^n < 1 + \sum_{k=1}^{n} \frac{1}{k!} < 3.$$


13. (a) Let p be a positive integer. Prove that

$$b^p - a^p = (b - a)(b^{p-1} + b^{p-2}a + b^{p-3}a^2 + \cdots + ba^{p-2} + a^{p-1}).$$

[Hint: Use the telescoping property for sums.]

(b) Let p and n denote positive integers. Use part (a) to show that

$$n^p < \frac{(n + 1)^{p+1} - n^{p+1}}{p + 1} < (n + 1)^p.$$

(c) Use induction to prove that

$$\sum_{k=1}^{n-1} k^p < \frac{n^{p+1}}{p + 1} < \sum_{k=1}^{n} k^p.$$

Part (b) will assist in making the inductive step from n to n + 1.

14. Let $a_1, \ldots, a_n$ be n real numbers, all having the same sign and all greater than -1. Use
induction to prove that

$$(1 + a_1)(1 + a_2) \cdots (1 + a_n) \ge 1 + a_1 + a_2 + \cdots + a_n.$$

In particular, when $a_1 = a_2 = \cdots = a_n = x$, where x > -1, this yields

(1.25) $(1 + x)^n \ge 1 + nx$ (Bernoulli's inequality).

Show that when n > 1 the equality sign holds in (1.25) only for x = 0.

15. If $n \ge 2$, prove that $n!/n^n \le (\frac{1}{2})^k$, where k is the greatest integer $\le n/2$.

16. The numbers 1, 2, 3, 5, 8, 13, 21, ..., in which each term after the second is the sum of its
two predecessors, are called Fibonacci numbers. They may be defined by induction as follows:

$a_1 = 1$, $a_2 = 2$, $a_{n+1} = a_n + a_{n-1}$ if $n \ge 2$.

Prove that

$$a_n < \left(\frac{1 + \sqrt{5}}{2}\right)^n$$

for every $n \ge 1$.

Inequalities relating different types of averages. Let $x_1, x_2, \ldots, x_n$ be n positive real numbers.
If p is a nonzero integer, the pth-power mean $M_p$ of the n numbers is defined as follows:

$$M_p = \left(\frac{x_1^p + \cdots + x_n^p}{n}\right)^{1/p}.$$

The number $M_1$ is also called the arithmetic mean, $M_2$ the root mean square, and $M_{-1}$ the
harmonic mean.

17. If p > 0, prove that $M_p < M_{2p}$ when $x_1, x_2, \ldots, x_n$ are not all equal.

[Hint: Apply the Cauchy-Schwarz inequality with $a_k = x_k^p$ and $b_k = 1$.]

18. Use the result of Exercise 17 to prove that

$$a^4 + b^4 + c^4 \ge \frac{64}{3}$$

if $a^2 + b^2 + c^2 = 8$ and a > 0, b > 0, c > 0.

19. Let $a_1, \ldots, a_n$ be n positive real numbers whose product is equal to 1. Prove that
$a_1 + \cdots + a_n \ge n$ and that the equality sign holds only if every $a_k = 1$.

[Hint: Consider two cases: (a) All $a_k = 1$; (b) not all $a_k = 1$. Use induction. In case
(b) notice that if $a_1 a_2 \cdots a_{n+1} = 1$, then at least one factor, say $a_1$, exceeds 1 and at least
one factor, say $a_{n+1}$, is less than 1. Let $b_1 = a_1 a_{n+1}$ and apply the induction hypothesis to
the product $b_1 a_2 \cdots a_n$, using the fact that $(a_1 - 1)(a_{n+1} - 1) < 0$.]
20. The geometric mean G of n positive real numbers $x_1, \ldots, x_n$ is defined by the formula
$G = (x_1 x_2 \cdots x_n)^{1/n}$.

(a) Let $M_p$ denote the pth power mean. Prove that $G \le M_1$ and that $G = M_1$ only when
$x_1 = x_2 = \cdots = x_n$.

(b) Let p and q be integers, q < 0 < p. From part (a) deduce that $M_q < G < M_p$ when
$x_1, x_2, \ldots, x_n$ are not all equal.

21. Use the result of Exercise 20 to prove the following statement: If a, b, and c are positive real
numbers such that abc = 8, then $a + b + c \ge 6$ and $ab + ac + bc \ge 12$.

22. If $x_1, \ldots, x_n$ are positive numbers and if $y_k = 1/x_k$, prove that

$$\left(\sum_{k=1}^{n} x_k\right)\left(\sum_{k=1}^{n} y_k\right) \ge n^2.$$

23. If a, b, and c are positive and if a + b + c = 1, prove that $(1 - a)(1 - b)(1 - c) \ge 8abc$.
THE CONCEPTS OF INTEGRAL CALCULUS


In this chapter we present the definition of the integral and some of its basic properties.
To understand the definition one must have some acquaintance with the function concept;
the next few sections are devoted to an explanation of this and related ideas.


1.1 The basic ideas of Cartesian geometry 

As mentioned earlier, one of the applications of the integral is the calculation of area.
Ordinarily we do not talk about area by itself. Instead, we talk about the area of something.
This means that we have certain objects (polygonal regions, circular regions, parabolic 
segments, etc.) whose areas we wish to measure. If we hope to arrive at a treatment of area 
that will enable us to deal with many different kinds of objects, we must first find an effective 
way to describe these objects. 

The most primitive way of doing this is by drawing figures, as was done by the ancient 
Greeks. A much better way was suggested by René Descartes (1596-1650), who introduced
the subject of analytic geometry (also known as Cartesian geometry). Descartes’ idea was 
to represent geometric points by numbers. The procedure for points in a plane is this: 

Two perpendicular reference lines (called coordinate axes) are chosen, one horizontal
(called the “x-axis”), the other vertical (the “y-axis”). Their point of intersection, denoted
by O, is called the origin. On the x-axis a convenient point is chosen to the right of O and
its distance from O is called the unit distance. Vertical distances along the y-axis are usually
measured with the same unit distance, although sometimes it is convenient to use a different
scale on the y-axis. Now each point in the plane (sometimes called the xy-plane) is assigned
a pair of numbers, called its coordinates. These numbers tell us how to locate the point.
Figure 1.1 illustrates some examples. The point with coordinates (3, 2) lies three units to
the right of the y-axis and two units above the x-axis. The number 3 is called the x-coordinate
of the point, 2 its y-coordinate. Points to the left of the y-axis have a negative x-coordinate;
those below the x-axis have a negative y-coordinate. The x-coordinate of a point is sometimes
called its abscissa and the y-coordinate is called its ordinate.

When we write a pair of numbers such as (a, b) to represent a point, we agree that the
abscissa or x-coordinate, a, is written first. For this reason, the pair (a, b) is often referred
to as an ordered pair. It is clear that two ordered pairs (a, b) and (c, d) represent the same
point if and only if we have a = c and b = d. Points (a, b) with both a and b positive
are said to lie in the first quadrant; those with a < 0 and b > 0 are in the second quadrant;
those with a < 0 and b < 0 are in the third quadrant; and those with a > 0 and b < 0
are in the fourth quadrant. Figure 1.1 shows one point in each quadrant.

The procedure for points in space is similar. We take three mutually perpendicular 
lines in space intersecting at a point (the origin). These lines determine three mutually 
perpendicular planes, and each point in space can be completely described by specifying, with 
appropriate regard for signs, its distances from these planes. We shall discuss three-dimen- 
sional Cartesian geometry in more detail later on; for the present we confine our attention 
to plane analytic geometry. 

A geometric figure, such as a curve in the plane, is a collection of points satisfying one
or more special conditions. By translating these conditions into expressions involving the
coordinates x and y, we obtain one or more equations which characterize the figure in
question. For example, consider a circle of radius r with its center at the origin, as shown
in Figure 1.2. Let P be an arbitrary point on this circle, and suppose P has coordinates
(x, y). Then the line segment OP is the hypotenuse of a right triangle whose legs have
lengths |x| and |y| and hence, by the theorem of Pythagoras,

$$x^2 + y^2 = r^2.$$

This equation, called a Cartesian equation of the circle, is satisfied by all points (x, y) on
the circle and by no others, so the equation completely characterizes the circle. This
example illustrates how analytic geometry is used to reduce geometrical statements about
points to analytical statements about real numbers.

Figure 1.2 The circle represented by the Cartesian equation $x^2 + y^2 = r^2$.
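The reduction of a geometric statement ("P lies on the circle") to a statement about numbers can be made concrete in a few lines of code. The sketch below is an illustration only; the function name `on_circle` is ours, and integer coordinates are assumed so that the equality test is exact.

```python
def on_circle(x, y, r):
    """Return True when the point (x, y) satisfies x^2 + y^2 = r^2.

    Integer arguments are assumed so that the comparison is exact.
    """
    return x * x + y * y == r * r


# Points tested against the circle of radius 5 centered at the origin.
print(on_circle(3, 4, 5))    # a point in the first quadrant
print(on_circle(-5, 0, 5))   # a point on the negative x-axis
print(on_circle(3, 3, 5))    # a point inside the circle
```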

Throughout their historical development, calculus and analytic geometry have been
intimately intertwined. New discoveries in one subject led to improvements in the other. 
The development of calculus and analytic geometry in this book is similar to the historical 
development, in that the two subjects are treated together. However, our primary purpose 
is to discuss calculus. Concepts from analytic geometry that are required for this purpose 
will be discussed as needed. Actually, only a few very elementary concepts of plane analytic
geometry are required to understand the rudiments of calculus. A deeper study of analytic 
geometry is needed to extend the scope and applications of calculus, and this study will be 
carried out in later chapters using vector methods as well as the methods of calculus. 
Until then, all that is required from analytic geometry is a little familiarity with drawing 
graphs of functions. 

1.2 Functions. Informal description and examples 

Various fields of human endeavor have to do with relationships that exist between one 
collection of objects and another. Graphs, charts, curves, tables, formulas, and Gallup polls 
are familiar to everyone who reads the newspapers. These are merely devices for describing 
special relations in a quantitative fashion. Mathematicians refer to certain types of these
relations as functions. In this section, we give an informal description of the function 
concept. A formal definition is given in Section 1.3. 

example 1. The force F necessary to stretch a steel spring a distance x beyond its natural
length is proportional to x. That is, F = cx, where c is a number independent of x called
the spring constant. This formula, discovered by Robert Hooke in the mid-17th century, is
called Hooke's law, and it is said to express the force as a function of the displacement.

example 2. The volume of a cube is a function of its edge-length. If the edges have
length x, the volume V is given by the formula $V = x^3$.

example 3. A prime is any integer n > 1 that cannot be expressed in the form n = ab, 
where a and b are positive integers, both less than n. The first few primes are 2, 3, 5, 7, 11, 
13, 17, 19. For a given real number x > 0, it is possible to count the number of primes less 
than or equal to x. This number is said to be a function of x even though no simple algebraic 
formula is known for computing it (without counting) when x is known. 

The word “function” was introduced into mathematics by Leibniz, who used the term 
primarily to refer to certain kinds of mathematical formulas. It was later realized that 
Leibniz’s idea of function was much too limited in its scope, and the meaning of the word 
has since undergone many stages of generalization. Today, the meaning of function is 
essentially this: Given two sets, say X and Y, a function is a correspondence which associates
with each element of X one and only one element of Y. The set X is called the domain of the 
function. Those elements of Y associated with the elements in X form a set called the range 
of the function. (This may be all of Y, but it need not be.) 
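For a finite set X, a correspondence of this kind can be written down explicitly. The sketch below is a hedged illustration: the sets and the rule are invented for the example, and a Python dictionary is used because it assigns exactly one value to each key, just as a function assigns exactly one element of Y to each element of X.

```python
# A function from X = {1, 2, 3, 4} into Y = {"even", "odd", "neither"}.
Y = {"even", "odd", "neither"}
f = {1: "odd", 2: "even", 3: "odd", 4: "even"}

domain = set(f.keys())          # the set X
value_range = set(f.values())   # the elements of Y actually used

print(domain)
print(value_range)
print(value_range == Y)         # the range need not be all of Y
```

Here the range is {"even", "odd"}, a proper part of Y, which illustrates the parenthetical remark above.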

Letters of the English and Greek alphabets are often used to denote functions. The
particular letters f, g, h, F, G, H, and φ are frequently used for this purpose. If f is a given
function and if x is an object of its domain, the notation f(x) is used to designate that object
in the range which is associated to x by the function f, and it is called the value of f at x
or the image of x under f. The symbol f(x) is read as “f of x.”

The function idea may be illustrated schematically in many ways. For example, in
Figure 1.3(a) the collections X and Y are thought of as sets of points and an arrow is used
to suggest a “pairing” of a typical point x in X with the image point f(x) in Y. Another
scheme is shown in Figure 1.3(b). Here the function f is imagined to be like a machine into
which objects of the collection X are fed and objects of Y are produced. When an object x
is fed into the machine, the output is the object f(x).

Figure 1.3 Schematic representations of the function idea.

Although the function idea places no restriction on the nature of the objects in the domain 
X and in the range Y, in elementary calculus we are primarily interested in functions whose 
domain and range are sets of real numbers. Such functions are called real-valued functions 
of a real variable, or, more briefly, real functions, and they may be illustrated geometrically 
by a graph in the xy-plane. We plot the domain X on the x-axis, and above each point x in 
X we plot the point (x, y), where y = f (x). The totality of such points (x, y) is called the 
graph of the function. 

Now we consider some more examples of real functions. 

example 4. The identity function. Suppose that f(x) = x for all real x. This function
is often called the identity function. Its domain is the real line, that is, the set of all real
numbers. Here x = y for each point (x, y) on the graph of f. The graph is a straight line
making equal angles with the coordinate axes (see Figure 1.4). The range of f is the set of
all real numbers.

example 5. The absolute-value function. Consider the function which assigns to each
real number x the nonnegative number |x|. A portion of its graph is shown in Figure 1.5.
Denoting this function by φ, we have φ(x) = |x| for all real x. For example, φ(0) = 0,
φ(2) = 2, φ(-3) = 3. We list here some properties of absolute values expressed in function
notation.

(a) φ(-x) = φ(x). (d) φ[φ(x)] = φ(x).

(b) $φ(x^2) = x^2$. (e) $φ(x) = \sqrt{x^2}$.

(c) $φ(x + y) \le φ(x) + φ(y)$ (the triangle inequality).

Figure 1.4 Graph of the identity function f(x) = x.

Figure 1.5 Absolute-value function φ(x) = |x|.


example 6. The prime-number function. For any x > 0, let π(x) be the number of primes
less than or equal to x. The domain of π is the set of positive real numbers. Its range is the
set of nonnegative integers {0, 1, 2, ...}. A portion of the graph of π is shown in Figure 1.6.
(Different scales are used on the x- and y-axes.) As x increases, the function value π(x)
remains constant until x reaches a prime, at which point the function value jumps by 1.
Therefore the graph of π consists of horizontal line segments. This is an example of a class
of functions called step functions; they play a fundamental role in the theory of the integral.

Figure 1.7 The factorial function.
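The step behavior of the prime-number function is easy to observe by direct counting. The following sketch uses plain trial division, which is adequate for small x; the names `is_prime` and `prime_count` are ours and stand in for the function written π in the text.

```python
import math


def is_prime(n):
    """Return True when the integer n > 1 has no divisor d with 1 < d < n."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def prime_count(x):
    """Number of primes less than or equal to the real number x > 0."""
    return sum(1 for n in range(2, math.floor(x) + 1) if is_prime(n))


# The value stays constant between primes and jumps by 1 at each prime.
for x in [1.5, 2, 2.9, 3, 4.5, 5, 10, 19]:
    print(x, prime_count(x))
```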

EXAMPLE 7. The factorial function. For every positive integer n, we define f(n) to be
$n! = 1 \cdot 2 \cdots n$. In this example, the domain of f is the set of positive integers. The
function values increase so rapidly that it is more convenient to display this function in
tabular form rather than as a graph. Figure 1.7 shows a table listing the pairs (n, n!) for
n = 1, 2, ..., 10.
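
A table of the pairs (n, n!) like the one described can be regenerated with a few lines. This is a sketch; `factorial_table` is a hypothetical helper name, and `math.factorial` is the standard-library routine.

```python
import math

def factorial_table(n_max: int = 10) -> list[tuple[int, int]]:
    """Return the pairs (n, n!) for n = 1, 2, ..., n_max."""
    return [(n, math.factorial(n)) for n in range(1, n_max + 1)]

# Print the table in two columns: n and n!.
for n, value in factorial_table():
    print(f"{n:>2}  {value:>9,}")
```

The rapid growth is already visible at n = 10, where the function value exceeds three million.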

The reader should note two features that all the above examples have in common. 

(1) For each x in the domain X there is one and only one image y that is paired with that
particular x.

(2) Each function generates a set of pairs (x, y), where x is a typical element of the 
domain X, and y is the unique element of Y that goes with x. 

In most of the above examples, we displayed the pairs (x, y) geometrically as points on a 
graph. In Example 7 we displayed them as entries in a table. In each case, to know the 
function is to know, in one way or another, all the pairs (x, y) that it generates. This simple
observation is the motivation behind the formal definition of the function concept that is
given in the next section.


* 1.3 Functions. Formal definition as a set of ordered pairs 

In the informal discussion of the foregoing section, a function was described as a corre- 
spondence which associates with each object in a set X one and only one object in a set Y. 
The words “correspondence” and “associates with” may not convey exactly the same 
meaning to all people, so we shall reformulate the whole idea in a different way, basing it on 
the set concept. First we require the notion of an ordered pair of objects.

In the definition of set equality, no mention is made of the order in which elements
appear. Thus, the sets {2, 5} and {5, 2} are equal because they consist of exactly the same
elements. Sometimes the order is important. For example, in plane analytic geometry the
coordinates (x, y) of a point represent an ordered pair of numbers. The point with
coordinates (2, 5) is not the same as the point with coordinates (5, 2), although the sets {2, 5}
and {5, 2} are equal. In the same way, if we have a pair of objects a and b (not necessarily
distinct) and if we wish to distinguish one of the objects, say a, as the first member and the
other, b, as the second, we enclose the objects in parentheses, (a, b). We refer to this as an
ordered pair. We say that two ordered pairs (a, b) and (c, d) are equal if and only if their
first members are equal and their second members are equal. That is to say, we have

(a, b) = (c, d) if and only if a = c and b = d.

Now we may state the formal definition of function. 

DEFINITION OF FUNCTION. A function f is a set of ordered pairs (x, y) no two of which
have the same first member.


If f is a function, the set of all elements x that occur as first members of pairs (x, y) in f
is called the domain of f. The set of second members y is called the range of f, or the set of
values of f.

Intuitively, a function can be thought of as a table consisting of two columns. Each
entry in the table is an ordered pair (x, y); the column of x's is the domain of f, and the
column of y's, the range. If two entries (x, y) and (x, z) appear in the table with the same
x-value, then for the table to be a function it is necessary that y = z. In other words, a
function cannot take two different values at a given point x. Therefore, for every x in the
domain of f there is exactly one y such that $(x, y) \in f$. Since this y is uniquely determined
once x is known, we can introduce a special symbol for it. It is customary to write

y = f(x)

instead of $(x, y) \in f$ to indicate that the pair (x, y) is in the set f.
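
The set-of-ordered-pairs definition translates directly into a small data-structure check. The following is a sketch for finite sets of pairs only; the helper names `is_function`, `domain`, `range_of`, and `value_at` are hypothetical.

```python
def is_function(pairs):
    """True if no two ordered pairs have the same first member."""
    first_members = [x for x, _ in pairs]
    return len(first_members) == len(set(first_members))

def domain(pairs):
    """The set of first members."""
    return {x for x, _ in pairs}

def range_of(pairs):
    """The set of second members."""
    return {y for _, y in pairs}

def value_at(pairs, x):
    """The unique y with (x, y) in the function, written y = f(x)."""
    for first, second in pairs:
        if first == x:
            return second
    raise KeyError(f"{x!r} is not in the domain")

# A function: squaring on the finite domain {-2, -1, 0, 1, 2}.
square = {(x, x * x) for x in range(-2, 3)}
# Not a function: two pairs share the first member 0.
circle_points = {(0, 1), (0, -1), (1, 0), (-1, 0)}

print(is_function(square), is_function(circle_points))
print(sorted(domain(square)), sorted(range_of(square)), value_at(square, -2))
```

Because the pairs are stored in a Python set, two functions represented this way are equal exactly when they contain the same pairs.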

As an alternative to describing a function f by specifying explicitly the pairs it contains, 
it is usually preferable to describe the domain of f, and then, for each x in the domain, to
describe how the function value f (x) is obtained. In this connection, we have the following 
theorem whose proof is left as an exercise for the reader. 
THEOREM 1.1. Two functions f and g are equal if and only if

(a) f and g have the same domain, and

(b) f(x) = g(x) for every x in the domain of f.

It is important to realize that the objects x and f(x) which appear in the ordered pairs
(x, f(x)) of a function need not be numbers but may be arbitrary objects of any kind.
Occasionally we shall use this degree of generality, but for the most part we shall be interested 
in real functions, that is, functions whose domain and range are subsets of the real line. 
Some of the functions that arise in calculus are described in the next few examples. 

1.4 More examples of real functions 

1. Constant functions. A function whose range consists of a single number is called a 
constant function. An example is shown in Figure 1.8, where f(x) = 3 for every real
x. The graph is a horizontal line cutting the y-axis at the point (0, 3). 


Figure 1.8 A constant function f(x) = 3.
Figure 1.9 A linear function g(x) = 2x - 1.
Figure 1.10 A quadratic polynomial $f(x) = x^2$.

2. Linear functions. A function g defined for all real x by a formula of the form 

g(x) = ax + b 

is called a linear function because its graph is a straight line. The number b is called 
the y-intercept of the line; it is the y-coordinate of the point (0, b) where the line cuts 
the y-axis. The number a is called the slope of the line. One example, g(x) = x, is 
shown in Figure 1.4. Another, g(x) = 2x - 1, is shown in Figure 1.9.

3. The power functions. For a fixed positive integer n, let f be defined by the equation
$f(x) = x^n$ for all real x. When n = 1, this is the identity function, shown in Figure 1.4.
For n = 2, the graph is a parabola, part of which is shown in Figure 1.10. For n = 3,
the graph is a cubic curve and has the appearance of that in Figure 1.11 (p. 56).
4. Polynomial functions. A polynomial function P is one defined for all real x by an
equation of the form

$P(x) = c_0 + c_1 x + \cdots + c_n x^n = \sum_{k=0}^{n} c_k x^k.$

The numbers $c_0, c_1, \ldots, c_n$ are called the coefficients of the polynomial, and the
nonnegative integer n is called its degree (if $c_n \ne 0$). Polynomial functions include the
constant functions and the power functions as special cases. Polynomials of degree 2, 3, and 4 are
called quadratic, cubic, and quartic polynomials, respectively. Figure 1.12 shows a
portion of the graph of a quartic polynomial P given by $P(x) = \tfrac{1}{2}x^4 - 2x^2$.

5. The circle. Suppose we return to the Cartesian equation of a circle, $x^2 + y^2 = r^2$, and
solve this equation for y in terms of x. There are two solutions given by

$y = \sqrt{r^2 - x^2}$ and $y = -\sqrt{r^2 - x^2}$.

(We remind the reader that if a > 0, the symbol $\sqrt{a}$ denotes the positive square root
of a. The negative square root is $-\sqrt{a}$.) There was a time when mathematicians would
say that y is a double-valued function of x given by $y = \pm\sqrt{r^2 - x^2}$. However, the
more modern point of view does not admit "double-valuedness" as a property of
functions. The definition of function requires that for each x in the domain, there
corresponds one and only one y in the range. Geometrically, this means that vertical
lines which intersect the graph do so at exactly one point. Therefore to make this
example fit the theory, we say that the two solutions for y define two functions, say
f and g, where

$f(x) = \sqrt{r^2 - x^2}$ and $g(x) = -\sqrt{r^2 - x^2}$

for each x satisfying $-r \le x \le r$. Each of these functions has for its domain the
interval extending from -r to r. If |x| > r, there is no real y such that $x^2 + y^2 = r^2$,
and we say that the functions f and g are not defined for such x. Since f(x) is the
nonnegative square root of $r^2 - x^2$, the graph of f is the upper semicircle shown in Figure
1.13. The function values of g are $\le 0$, and hence the graph of g is the lower semicircle
shown in Figure 1.13.

6. Sums, products, and quotients of functions. Let f and g be two real functions having
the same domain D. We can construct new functions from f and g by adding,
multiplying, or dividing the function values. The function u defined by the equation

$u(x) = f(x) + g(x)$ if $x \in D$

is called the sum of f and g and is denoted by f + g. Similarly, the product $v = f \cdot g$
and the quotient w = f/g are the functions defined by the respective formulas

$v(x) = f(x)g(x)$ if $x \in D$,     $w(x) = f(x)/g(x)$ if $x \in D$ and $g(x) \ne 0$.
Figure 1.11 A cubic polynomial: $P(x) = x^3$.
Figure 1.12 A quartic polynomial: $P(x) = \tfrac{1}{2}x^4 - 2x^2$.
Figure 1.13 Graphs of two functions: $f(x) = \sqrt{r^2 - x^2}$, $g(x) = -\sqrt{r^2 - x^2}$.
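
The polynomial functions and the sum, product, and quotient constructions described above can be sketched as functions that build other functions. Everything named here (`polynomial`, `add`, `multiply`, `divide`) is a hypothetical helper for illustration, and Horner's rule is one convenient way to evaluate the sum, not a method taken from the text.

```python
from typing import Callable, Sequence

RealFunction = Callable[[float], float]

def polynomial(coefficients: Sequence[float]) -> RealFunction:
    """P(x) = c_0 + c_1 x + ... + c_n x^n, evaluated by Horner's rule."""
    def p(x: float) -> float:
        result = 0.0
        for c in reversed(coefficients):
            result = result * x + c
        return result
    return p

def add(f: RealFunction, g: RealFunction) -> RealFunction:
    """The sum u = f + g, with u(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def multiply(f: RealFunction, g: RealFunction) -> RealFunction:
    """The product v, with v(x) = f(x) g(x)."""
    return lambda x: f(x) * g(x)

def divide(f: RealFunction, g: RealFunction) -> RealFunction:
    """The quotient w = f/g, defined only where g(x) is not 0."""
    def w(x: float) -> float:
        if g(x) == 0:
            raise ValueError("quotient is not defined where g(x) = 0")
        return f(x) / g(x)
    return w

cube = polynomial([0, 0, 0, 1])   # P(x) = x^3
line = polynomial([-1, 2])        # g(x) = 2x - 1

print(add(cube, line)(2), multiply(cube, line)(2), divide(cube, line)(2))
```

The quotient raises an error at x = 1/2, where the linear function vanishes, which mirrors the restriction on the domain of f/g.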

The next set of exercises is intended to give the reader some familiarity with the use of 
the function notation. 


1.5 Exercises


1. Let f(x) = x + 1 for all real x. Compute the following: f(2), f(-2), -f(2), f(1/2), 1/f(2),
f(a + b), f(a) + f(b), f(a)f(b).

2. Let f(x) = 1 + x and let g(x) = 1 - x for all real x. Compute the following: f(2) + g(2),
f(2) - g(2), f(2)g(2), f(2)/g(2), f[g(2)], g[f(2)], f(a) + g(-a), f(t)g(-t).

3. Let $\varphi(x) = |x - 3| + |x - 1|$ for all real x. Compute the following: $\varphi(0)$, $\varphi(1)$, $\varphi(2)$, $\varphi(3)$,
$\varphi(-1)$, $\varphi(-2)$. Find all t for which $\varphi(t + 2) = \varphi(t)$.

4. Let $f(x) = x^2$ for all real x. Verify each of the following formulas. In each case describe the
set of real x, y, t, etc., for which the given formula is valid.

(a) $f(-x) = f(x)$.          (d) $f(2y) = 4f(y)$.

(b) $f(y) - f(x) = (y - x)(y + x)$.          (e) $f(t^2) = f(t)^2$.

(c) $f(x + h) - f(x) = 2xh + h^2$.          (f) $\sqrt{f(a)} = |a|$.

5. Let $g(x) = \sqrt{4 - x^2}$ for $|x| \le 2$. Verify each of the following formulas and tell for which
values of x, y, s, and t the given formula is valid.

(a) $g(-x) = g(x)$.          (d) $g(a - 2) = \sqrt{4a - a^2}$.

(b) g(2y) =

(c)

(f) $\frac{1}{2 + g(x)} = \frac{2 - g(x)}{x^2}$.


6. Let f be defined as follows: f(x) = 1 for $0 \le x \le 1$; f(x) = 2 for $1 < x \le 2$. The function
is not defined if x < 0 or if x > 2.

(a) Draw the graph of f.

(b) Let g(x) = f(2x). Describe the domain of g and draw its graph.

(c) Let h(x) = f(x - 2). Describe the domain of h and draw its graph.

(d) Let k(x) = f(2x) + f(x - 2). Describe the domain of k and draw its graph.

7. The graphs of the two polynomials g(x) = x and $f(x) = x^3$ intersect at three points. Draw
enough of their graphs to show how they intersect.

8. The graphs of the two quadratic polynomials $f(x) = x^2 - 2$ and $g(x) = 2x^2 + 4x + 1$
intersect at two points. Draw the portions of the two graphs between the points of intersection.

9. This exercise develops some fundamental properties of polynomials. Let $f(x) = \sum_{k=0}^{n} c_k x^k$ be
a polynomial of degree n. Prove each of the following:

(a) If $n \ge 1$ and f(0) = 0, then f(x) = xg(x), where g is a polynomial of degree n - 1.

(b) For each real a, the function p given by p(x) = f(x + a) is a polynomial of degree n.

(c) If $n \ge 1$ and f(a) = 0 for some real a, then f(x) = (x - a)h(x), where h is a polynomial of
degree n - 1. [Hint: Consider p(x) = f(x + a).]

(d) If f(x) = 0 for n + 1 distinct real values of x, then every coefficient $c_k$ is zero and f(x) = 0
for all real x.

(e) Let $g(x) = \sum_{k=0}^{m} b_k x^k$ be a polynomial of degree m, where $m \ge n$. If g(x) = f(x) for m + 1
distinct real values of x, then m = n, $b_k = c_k$ for each k, and g(x) = f(x) for all real x.

10. In each case, find all polynomials p of degree $\le 2$ which satisfy the given conditions.

(a) p(0) = p(1) = p(2) = 1.          (c) p(0) = p(1) = 1.

(b) p(0) = p(1) = 1, p(2) = 2.          (d) p(0) = p(1).

11. In each case, find all polynomials p of degree $\le 2$ which satisfy the given conditions for all
real x.

(a) p(x) = p(1 - x).          (c) p(2x) = 2p(x).

(b) p(x) = p(1 + x).          (d) p(3x) = p(x + 3).

12. Show that the following are polynomials by converting them to the form $\sum_{k=0}^{m} a_k x^k$ for a
suitable m. In each case n is a positive integer.

(a) $(1 + x)^{2n}$.     (b) $\frac{1 - x^{n+1}}{1 - x}$, $x \ne 1$.     (c) $\prod_{k=0}^{n} (1 + x^{2^k})$.

1.6 The concept of area as a set function 

When a mathematician attempts to develop a general theory encompassing many different 
concepts, he tries to isolate common properties which seem to be basic to each of the 
particular applications he has in mind. He then uses these properties as fundamental 
building blocks of his theory. Euclid used this process when he developed elementary 
geometry as a deductive system based on a set of axioms. We used the same process in our 
axiomatic treatment of the real number system, and we shall use it once more in our
discussion of area.

When we assign an area to a plane region, we associate a number with a set S in the plane. 
From a purely mathematical viewpoint, this means that we have a function a (an area 
function) which assigns a real number a(S) (the area of S) to each set S in some given 
collection of sets. A function of this kind, whose domain is a collection of sets and whose 
function values are real numbers, is called a set function. The basic problem is this: Given a
plane set S, what area a(S) shall we assign to S? 

Our approach to this problem is to start with a number of properties we feel area should 
have and take these as axioms for area. Any set function which satisfies these axioms will 
be called an area function. To make certain we are not discussing an empty theory, it is
necessary to show that an area function actually exists. We shall not attempt to do this here. 
Instead, we assume the existence of an area function and deduce further properties from the 
axioms. An elementary construction of an area function may be found in Chapters 14 and 
22 of Edwin E. Moise, Elementary Geometry From An Advanced Standpoint, Addison- 
Wesley Publishing Co., 1963. 

Before we state the axioms for area, we will make a few remarks about the collection of
sets in the plane to which an area can be assigned. These sets will be called measurable
sets; the collection of all measurable sets will be denoted by $\mathcal{M}$. The axioms contain enough
information about the sets in $\mathcal{M}$ to enable us to prove that all geometric figures arising in
the usual applications of calculus are in $\mathcal{M}$ and that their areas can be calculated by
integration.

One of the axioms (Axiom 5) states that every rectangle is measurable and that its area
is the product of the lengths of its edges. The term “rectangle” as used here refers to any
set congruent† to a set of the form

$\{(x, y) \mid 0 \le x \le h,\ 0 \le y \le k\}$,

where $h \ge 0$ and $k \ge 0$. The numbers h and k are called the lengths of the edges of the
rectangle. We consider a line segment or a point to be a special case of a rectangle by
allowing h or k (or both) to be zero.



Figure 1.14 A step region.

Figure 1.15 An ordinate set enclosed by two step regions: (a) ordinate set; (b) inner step
region; (c) outer step region.


From rectangles we can build up more complicated sets. The set shown in Figure 1.14 
is the union of a finite collection of adjacent rectangles with their bases resting on the x-axis 
and is called a step region. The axioms imply that each step region is measurable and that 
its area is the sum of the areas of the rectangular pieces. 

The region Q shown in Figure 1.15(a) is an example of an ordinate set. Its upper boundary
is the graph of a nonnegative function. Axiom 6 will enable us to prove that many ordinate 
sets are measurable and that their areas can be calculated by approximating such sets by 
inner and outer step regions, as shown in Figure 1.15(b) and (c). 

We turn now to the axioms themselves. 


AXIOMATIC DEFINITION OF AREA. We assume there exists a class $\mathcal{M}$ of measurable sets
in the plane and a set function a, whose domain is $\mathcal{M}$, with the following properties:

1. Nonnegative property. For each set S in $\mathcal{M}$, we have $a(S) \ge 0$.


† Congruence is used here in the same sense as in elementary Euclidean geometry. Two sets are said to be
congruent if their points can be put in one-to-one correspondence in such a way that distances are preserved.
That is, if two points p and q in one set correspond to p' and q' in the other, the distance from p to q must
be equal to the distance from p' to q'; this must be true for all choices of p and q.

2. Additive property. If S and T are in $\mathcal{M}$, then $S \cup T$ and $S \cap T$ are in $\mathcal{M}$, and we have

$a(S \cup T) = a(S) + a(T) - a(S \cap T)$.

3. Difference property. If S and T are in $\mathcal{M}$ with $S \subseteq T$, then T - S is in $\mathcal{M}$, and we have
$a(T - S) = a(T) - a(S)$.

4. Invariance under congruence. If a set S is in $\mathcal{M}$ and if T is congruent to S, then T is also
in $\mathcal{M}$ and we have a(S) = a(T).

5. Choice of scale. Every rectangle R is in $\mathcal{M}$. If the edges of R have lengths h and k,
then a(R) = hk.

6. Exhaustion property. Let Q be a set that can be enclosed between two step regions
S and T, so that

(1.1) $S \subseteq Q \subseteq T$.

If there is one and only one number c which satisfies the inequalities

$a(S) \le c \le a(T)$

for all step regions S and T satisfying (1.1), then Q is measurable and a(Q) = c.

Axiom 1 simply states that the area of a plane measurable set is either a positive number 
or zero. Axiom 2 tells us that when a set is formed from two pieces (which may overlap), 
the area of the union is the sum of the areas of the two parts minus the area of their inter- 
section. In particular, if the intersection has zero area, the area of the whole is the sum of 
the areas of the two parts. 

If we remove a measurable set S from a larger measurable set T, Axiom 3 states that the
remaining part, T - S, is measurable and its area is obtained by subtraction,
$a(T - S) = a(T) - a(S)$. In particular, this axiom implies that the empty set $\emptyset$ is measurable and has
zero area. Since $a(T - S) \ge 0$, Axiom 3 also implies the monotone property:

$a(S) \le a(T)$, for sets S and T in $\mathcal{M}$ with $S \subseteq T$.

In other words, a set which is part of another cannot have a larger area.

Axiom 4 assigns equal areas to sets having the same size and shape. The first four 
axioms would be trivially satisfied if we assigned the number 0 as the area of every set in
$\mathcal{M}$. Axiom 5 assigns a nonzero area to some rectangles and thereby excludes this trivial
case. Finally, Axiom 6 incorporates the Greek method of exhaustion; it enables us to 
extend the class of measurable sets from step regions to more general regions. 

Axiom 5 assigns zero area to each line segment. Repeated use of the additive property 
shows that every step region is measurable and that its area is the sum of the areas of the 
rectangular pieces. Further elementary consequences of the axioms are discussed in the 
next set of exercises. 
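
For rectangles with edges parallel to the axes, the additive property and the step-region formula can be illustrated with a short computation. This is a sketch under that restriction; the class `Rect` and the helpers `intersection_area`, `union_area`, and `step_region_area` are hypothetical names, and the sample rectangles are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    """Rectangle {(x, y) | x0 <= x <= x1, y0 <= y <= y1} with axis-parallel edges."""
    x0: float
    x1: float
    y0: float
    y1: float

    def area(self) -> float:
        # Axiom 5: the area is the product of the edge lengths.
        return (self.x1 - self.x0) * (self.y1 - self.y0)

def intersection_area(s: Rect, t: Rect) -> float:
    """Area of the overlap of s and t (zero if they meet in a segment, a point, or not at all)."""
    width = min(s.x1, t.x1) - max(s.x0, t.x0)
    height = min(s.y1, t.y1) - max(s.y0, t.y0)
    return width * height if width > 0 and height > 0 else 0.0

def union_area(s: Rect, t: Rect) -> float:
    """Axiom 2: a(S u T) = a(S) + a(T) - a(S n T)."""
    return s.area() + t.area() - intersection_area(s, t)

def step_region_area(points, heights) -> float:
    """Sum of the areas of adjacent rectangles resting on the x-axis."""
    return sum(h * (right - left)
               for h, left, right in zip(heights, points, points[1:]))

S = Rect(0, 2, 0, 2)
T = Rect(1, 3, 0, 1)
print(union_area(S, T))                            # overlapping pieces
print(step_region_area([0, 1, 3, 4], [2, 1, 3]))   # three adjacent rectangles
```

Adjacent rectangles in a step region overlap only along vertical segments, which have zero area, so the subtracted term vanishes and the areas simply add.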

1.7 Exercises

The properties of area in this set of exercises are to be deduced from the axioms for area stated 
in the foregoing section. 

1. Prove that each of the following sets is measurable and has zero area: (a) A set consisting of a 
single point, (b) A set consisting of a finite number of points in a plane, (c) The union of a 
finite collection of line segments in a plane. 

2. Every right triangular region is measurable because it can be obtained as the intersection of 
two rectangles. Prove that every triangular region is measurable and that its area is one half 
the product of its base and altitude. 

3. Prove that every trapezoid and every parallelogram is measurable and derive the usual formulas 
for their areas. 

4. A point (x, y) in the plane is called a lattice point if both coordinates x and y are integers. Let
P be a polygon whose vertices are lattice points. The area of P is $I + \tfrac{1}{2}B - 1$, where I denotes
the number of lattice points inside the polygon and B denotes the number on the boundary.

(a) Prove that the formula is valid for rectangles with sides parallel to the coordinate axes. 

(b) Prove that the formula is valid for right triangles and parallelograms. 

(c) Use induction on the number of edges to construct a proof for general polygons. 

5. Prove that a triangle whose vertices are lattice points cannot be equilateral. 

[Hint: Assume there is such a triangle and compute its area in two ways, using 
Exercises 2 and 4.] 

6. Let A = {1, 2, 3, 4, 5}, and let $\mathcal{M}$ denote the class of all subsets of A. (There are 32 altogether,
counting A itself and the empty set $\emptyset$.) For each set S in $\mathcal{M}$, let n(S) denote the number of
distinct elements in S. If S = {1, 2, 3, 4} and T = {3, 4, 5}, compute $n(S \cup T)$, $n(S \cap T)$,
n(S - T), and n(T - S). Prove that the set function n satisfies the first three axioms for area.


1.8 Intervals and ordinate sets 


In the theory of integration we are concerned primarily with real functions whose domains 
are intervals on the x-axis. Sometimes it is important to distinguish between intervals 
which include their endpoints and those which do not. This distinction is made by introducing 
the following definitions. 


Figure 1.16 Examples of intervals (closed, open, and two half-open).


If a < b, we denote by [a, b] the set of all x satisfying the inequalities $a \le x \le b$ and
refer to this set as the closed interval from a to b. The corresponding open interval, written
(a, b), is the set of all x satisfying a < x < b. The closed interval [a, b] includes the
endpoints a and b, whereas the open interval does not. (See Figure 1.16.) The open interval
(a, b) is also called the interior of [a, b]. Half-open intervals (a, b] and [a, b), which include
just one endpoint, are defined by the inequalities $a < x \le b$ and $a \le x < b$, respectively.

Let f be a nonnegative function whose domain is a closed interval [a, b]. The portion
of the plane between the graph of f and the x-axis is called the ordinate set of f. More
precisely, the ordinate set of f is the collection of all points (x, y) satisfying the inequalities

$a \le x \le b$,     $0 \le y \le f(x)$.

In each of the examples shown in Figure 1.17 the shaded portion represents the ordinate 
set of the corresponding function. 

Figure 1.17 Examples of ordinate sets.

Ordinate sets are the geometric objects whose areas we want to compute by means of the
integral calculus. We shall define the concept of integral first for step functions and then
use the integral of a step function to formulate the definition of integral for more general
functions. Integration theory for step functions is extremely simple and leads in a natural
way to the corresponding theory for more general functions. To start this program, it is
necessary to have an analytic definition of a step function. This may be given most simply
in terms of the concept of a partition, to which we turn now.


1.9 Partitions and step functions 

Suppose we decompose a given closed interval [a, b] into n subintervals by inserting
n - 1 points of subdivision, say $x_1, x_2, \ldots, x_{n-1}$, subject only to the restriction

(1.2) $a < x_1 < x_2 < \cdots < x_{n-1} < b$.

It is convenient to denote the point a itself by $x_0$ and the point b by $x_n$. A collection of
points satisfying (1.2) is called a partition P of [a, b], and we use the symbol

$P = \{x_0, x_1, \ldots, x_n\}$

to designate this partition. The partition P determines n closed subintervals

$[x_0, x_1], [x_1, x_2], \ldots, [x_{n-1}, x_n]$.

A typical closed subinterval is $[x_{k-1}, x_k]$, and it is referred to as the kth closed subinterval
of P; an example is shown in Figure 1.18. The corresponding open interval $(x_{k-1}, x_k)$ is
called the kth open subinterval of P.

Now we are ready to formulate an analytic definition of a step function. 

Figure 1.18 An example of a partition of [a, b].


DEFINITION OF A STEP FUNCTION. A function s, whose domain is a closed interval [a, b],
is called a step function if there is a partition $P = \{x_0, x_1, \ldots, x_n\}$ of [a, b] such that s is
constant on each open subinterval of P. That is to say, for each k = 1, 2, ..., n, there is
a real number $s_k$ such that

$s(x) = s_k$ if $x_{k-1} < x < x_k$.

Step functions are sometimes called piecewise constant functions.

Note: At each of the endpoints $x_{k-1}$ and $x_k$ the function must have some well-defined
value, but this need not be the same as $s_k$.
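
The definition can be mirrored by a small constructor that builds a step function from a partition and the constants on its open subintervals. This is a sketch: the name `make_step_function`, the optional `at_points` dictionary, and the default used at partition points when no value is supplied are all assumptions made for illustration.

```python
from bisect import bisect_left

def make_step_function(points, values, at_points=None):
    """Step function on [points[0], points[-1]].

    points    -- partition x_0 < x_1 < ... < x_n
    values    -- constants s_1, ..., s_n on the open subintervals
    at_points -- optional dict giving the value at partition points;
                 by default (an assumption of this sketch) a partition point
                 takes the constant of the subinterval to its right, and the
                 right endpoint b takes s_n.
    """
    if len(points) != len(values) + 1:
        raise ValueError("need exactly one value per subinterval")
    if any(p >= q for p, q in zip(points, points[1:])):
        raise ValueError("partition points must be strictly increasing")
    at_points = dict(at_points or {})

    def s(x):
        if not points[0] <= x <= points[-1]:
            raise ValueError("x is outside [a, b]")
        k = bisect_left(points, x)      # index of the first partition point >= x
        if points[k] == x:              # x is itself a partition point
            default = values[min(k, len(values) - 1)]
            return at_points.get(x, default)
        return values[k - 1]            # x_{k-1} < x < x_k
    return s

s = make_step_function([0, 1, 3, 4], [2, 1, 3], at_points={1: 10})
print(s(0.5), s(1), s(2), s(3), s(4))
```

The value 10 at the partition point 1 differs from both neighboring constants, which the definition explicitly allows.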

EXAMPLE. A familiar example of a step function is the “postage function,” whose graph
is shown in Figure 1.19. Assume that the charge for first-class mail for parcels weighing
up to 20 pounds is 5 cents for every ounce or fraction thereof. The graph shows the number
of 5-cent stamps required for mail weighing up to 4 ounces. In this case the line segments
on the graph are half-open intervals containing their right endpoints. The domain of the
function is the interval [0, 320].

From a given partition P of [a, b], we can always form a new partition P' by adjoining
more subdivision points to those already in P. Such a partition P' is called a refinement
of P and is said to be finer than P. For example, P = {0, 1, 2, 3, 4} is a partition of the
interval [0, 4]. If we adjoin the points 3/4, $\sqrt{2}$, and 7/2, we obtain a new partition P' of
[0, 4], namely, $P' = \{0, 3/4, 1, \sqrt{2}, 2, 3, 7/2, 4\}$, which is a refinement of P. (See Figure
1.20.) If a step function is constant on the open subintervals of P, then it is also constant
on the open subintervals of every refinement P'.

Figure 1.19 The postage function.
Figure 1.20 A partition P of [0, 4] and a refinement P'.
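
The refinement of {0, 1, 2, 3, 4} by the points 3/4, the square root of 2, and 7/2 can be reproduced by treating a partition as a sorted list of points. This is a sketch; `refine` and `is_refinement` are hypothetical helper names.

```python
import math

def refine(partition, new_points):
    """Adjoin new subdivision points and return the finer partition, sorted."""
    return sorted(set(partition) | set(new_points))

def is_refinement(finer, coarser):
    """True if every point of `coarser` also belongs to `finer`."""
    return set(coarser) <= set(finer)

P = [0, 1, 2, 3, 4]
P_prime = refine(P, [3 / 4, math.sqrt(2), 7 / 2])

print(P_prime)
print(is_refinement(P_prime, P), is_refinement(P, P_prime))
```

Every open subinterval of the finer partition lies inside one open subinterval of P, which is why a function constant on the open subintervals of P stays constant on those of P'.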


1.10 Sum and product of step functions 

New step functions may be formed from given step functions by adding corresponding
function values. For example, suppose s and t are step functions, both defined on the
same interval [a, b]. Let $P_1$ and $P_2$ be partitions of [a, b] such that s is constant on the open
subintervals of $P_1$ and t is constant on the open subintervals of $P_2$. Let u = s + t be the
function defined by the equation

$u(x) = s(x) + t(x)$ if $a \le x \le b$.


Figure 1.21 The sum of two step functions (graph of s, graph of t, graph of s + t).

To show that u is actually a step function, we must exhibit a partition P such that u is
constant on the open subintervals of P. For the new partition P, we take all the points of
$P_1$ along with all the points of $P_2$. This partition, the union of $P_1$ and $P_2$, is called the
common refinement of $P_1$ and $P_2$. Since both s and t are constant on the open subintervals
of the common refinement, the same is true of u. An example is illustrated in Figure 1.21.
The partition $P_1$ is $\{a, x_1, b\}$, the partition $P_2$ is $\{a, x_1', b\}$, and the common refinement is
$\{a, x_1', x_1, b\}$.

Similarly, the product $v = s \cdot t$ of two step functions is another step function. An
important special case occurs when one of the factors, say t, is constant throughout [a, b].
If t(x) = c for each x in [a, b], then each function value v(x) is obtained by multiplying the
function value s(x) by the constant c.
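
The common-refinement argument can be followed step by step in code. In this sketch a step function is represented only by its partition and its constants on the open subintervals (values at the partition points are ignored); the names `values_on` and `combine` and the sample data are hypothetical.

```python
import operator

def values_on(points, values, finer):
    """Constants of the step function (points, values) on the open subintervals of `finer`.

    `finer` must be a refinement of `points`.
    """
    result = []
    k = 0
    for left, right in zip(finer, finer[1:]):
        midpoint = (left + right) / 2
        while points[k + 1] < midpoint:   # advance to the subinterval containing midpoint
            k += 1
        result.append(values[k])
    return result

def combine(s, t, op):
    """Apply `op` to two step functions on their common refinement."""
    (p1, v1), (p2, v2) = s, t
    common = sorted(set(p1) | set(p2))    # union of the two partitions
    combined = [op(x, y) for x, y in zip(values_on(p1, v1, common),
                                         values_on(p2, v2, common))]
    return common, combined

s = ([0, 2, 4], [1, 3])     # s = 1 on (0, 2), 3 on (2, 4)
t = ([0, 1, 4], [5, -1])    # t = 5 on (0, 1), -1 on (1, 4)

print(combine(s, t, operator.add))
print(combine(s, t, operator.mul))
```

Here the partitions {0, 2, 4} and {0, 1, 4} play the roles of the two partitions in the text, and {0, 1, 2, 4} is their common refinement.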

1.11 Exercises 

In this set of exercises, [x] denotes the greatest integer $\le x$.

1. Let f(x) = [x] and let g(x) = [2x] for all real x. In each case, draw the graph of the function
h defined over the interval [-1, 2] by the formula given.

(a) h(x) = f(x) + g(x).          (c) h(x) = f(x)g(x).

(b) h(x) = f(x) + g(x/2).          (d) $h(x) = \tfrac{1}{4} f(2x) g(x/2)$.

2. In each case, f is a function defined over the interval [-2, 2] by the formula given. Draw the
graph of f. If f is a step function, find a partition P of [-2, 2] such that f is constant on the
open subintervals of P.

(a) f(x) = x + [x].          (d) f(x) = 2[x].

(b) f(x) = x - [x].          (e) $f(x) = [x + \tfrac{1}{2}]$.

(c) f(x) = [-x].          (f) $f(x) = [x] + [x + \tfrac{1}{2}]$.

3. In each case, sketch the graph of the function f defined by the formula given.

(a) $f(x) = [\sqrt{x}]$ for $0 \le x \le 10$.          (c) $f(x) = \sqrt{[x]}$ for $0 \le x \le 10$.

(b) $f(x) = [x^2]$ for $0 \le x \le 3$.          (d) $f(x) = [x]^2$ for $0 \le x \le 3$.

4. Prove that the greatest-integer function has the properties indicated.

(a) [x + n] = [x] + n for every integer n.

(b) [-x] = -[x] if x is an integer, and [-x] = -[x] - 1 otherwise.

(c) [x + y] = [x] + [y] or [x] + [y] + 1.

(d) $[2x] = [x] + [x + \tfrac{1}{2}]$.

(e) $[3x] = [x] + [x + \tfrac{1}{3}] + [x + \tfrac{2}{3}]$.

Optional exercises.

5. The formulas in Exercises 4(d) and 4(e) suggest a generalization for [nx]. State and prove
such a generalization.

6. Recall that a lattice point (x, y) in the plane is one whose coordinates are integers. Let f be a
nonnegative function whose domain is the interval [a, b], where a and b are integers, a < b.
Let S denote the set of points (x, y) satisfying $a \le x \le b$, $0 < y \le f(x)$. Prove that the number
of lattice points in S is equal to the sum

$\sum_{n=a}^{b} [f(n)]$.

7. If a and b are positive integers with no common factor, we have the formula

$\sum_{n=1}^{b-1} \left[\frac{na}{b}\right] = \frac{(a - 1)(b - 1)}{2}$.

When b = 1, the sum on the left is understood to be 0.

(a) Derive this result by a geometric argument, counting lattice points in a right triangle.

(b) Derive the result analytically as follows: By changing the index of summation, note that
$\sum_{n=1}^{b-1} [na/b] = \sum_{n=1}^{b-1} [a(b - n)/b]$. Now apply Exercises 4(a) and (b) to the bracket on the
right.

8. Let S be a set of points on the real line. The characteristic function of S is, by definition, the
function $\chi_S$ such that $\chi_S(x) = 1$ for every x in S, and $\chi_S(x) = 0$ for those x not in S. Let f be
a step function which takes the constant value $c_k$ on the kth open subinterval $I_k$ of some partition
of an interval [a, b]. Prove that for each x in the union $I_1 \cup I_2 \cup \cdots \cup I_n$ we have

$f(x) = \sum_{k=1}^{n} c_k \chi_{I_k}(x)$.

This property is described by saying that every step function is a linear combination of
characteristic functions of intervals.


1.12 The definition of the integral for step functions 

In this section we introduce the integral for step functions. The definition is constructed 
so that the integral of a nonnegative step function is equal to the area of its ordinate set. 
Let s be a step function defined on [a, b], and let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of
[a, b] such that s is constant on the open subintervals of P. Denote by $s_k$ the constant value
that s takes in the kth open subinterval, so that

$s(x) = s_k$ if $x_{k-1} < x < x_k$,     k = 1, 2, ..., n.

DEFINITION OF THE INTEGRAL OF STEP FUNCTIONS. The integral of s from a to b, denoted
by the symbol $\int_a^b s(x)\,dx$, is defined by the following formula:

(1.3) $\int_a^b s(x)\,dx = \sum_{k=1}^{n} s_k \cdot (x_k - x_{k-1})$.

That is to say, to compute the integral, we multiply each constant value s k by the length of 
the kth subinterval, and then we add together all these products. 
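
Formula (1.3) is a finite sum and can be computed directly. In this sketch the function name `integral` and the sample partition and constants are illustrative assumptions; the step function is described only by its partition and its constants on the open subintervals, since the values at the subdivision points do not enter the sum.

```python
def integral(points, values):
    """Integral of a step function by formula (1.3).

    points -- partition x_0 < x_1 < ... < x_n of [a, b]
    values -- constants s_1, ..., s_n on the open subintervals
    """
    if len(points) != len(values) + 1:
        raise ValueError("need exactly one value per subinterval")
    # Multiply each constant s_k by the length x_k - x_{k-1} and add the products.
    return sum(s_k * (right - left)
               for s_k, left, right in zip(values, points, points[1:]))

print(integral([0, 1, 3, 4], [2, 1, 3]))   # 2*1 + 1*2 + 3*1
print(integral([1, 4], [5]))               # constant c = 5 on (1, 4): c(b - a)
```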

Note that the values of s at the subdivision points are immaterial since they do not appear
on the right-hand side of (1.3). In particular, if s is constant on the open interval (a, b), say
s(x) = c if a < x < b, then we have

$\int_a^b s(x)\,dx = c \sum_{k=1}^{n} (x_k - x_{k-1}) = c(b - a)$,

regardless of the values s(a) and s(b). If c > 0 and if s(x) = c for all x in the closed interval 
[a, b], the ordinate set of s is a rectangle of base b — a and altitude c; the integral of s is 
c(b — a), the area of this rectangle. Changing the value of s at one or both endpoints a or b 
changes the ordinate set but does not alter the integral of s or the area of its ordinate set. 
For example, the two ordinate sets shown in Figure 1.22 have equal areas. 



Figure 1.22 Changes in function values at two points do not alter area of ordinate set.

Figure 1.23 The ordinate set of a step function.


The ordinate set of any nonnegative step function s consists of a finite number of rect- 
angles, one for each interval of constancy; the ordinate set may also contain or lack certain 
vertical line segments, depending on how s is defined at the subdivision points. The integral 
of s is equal to the sum of the areas of the individual rectangles, regardless of the values s 
takes at the subdivision points. This is consistent with the fact that the vertical segments 
have zero area and make no contribution to the area of the ordinate set. In Figure 1.23, 
the step function s takes the constant values 2, 1, and 9/4 in the open intervals (1, 2), (2, 5), 
and (5, 6), respectively. Its integral is equal to 

\[ \int_1^6 s(x)\,dx = 2 \cdot (2 - 1) + 1 \cdot (5 - 2) + \tfrac{9}{4} \cdot (6 - 5) = \tfrac{29}{4} . \]

It should be noted that the formula for the integral in (1.3) is independent of the choice of 
the partition P as long as s is constant on the open subintervals of P. For example, suppose 
we change from P to a finer partition P' by inserting exactly one new subdivision point t, 
where $x_0 < t < x_1$. Then the first term on the right of (1.3) is replaced by the two terms 
$s_1 \cdot (t - x_0)$ and $s_1 \cdot (x_1 - t)$, and the rest of the terms are unchanged. Since 

\[ s_1 \cdot (t - x_0) + s_1 \cdot (x_1 - t) = s_1 \cdot (x_1 - x_0) , \]

the value of the entire sum is unchanged. We can proceed from P to any finer partition P' 
by inserting the new subdivision points one at a time. At each stage, the sum in (1.3) 
remains unchanged, so the integral is the same for all refinements of P . 
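
The refinement argument can be checked on a small case. In the sketch below, the helper names `step_sum` and `insert_point` and the sample step function are hypothetical; the code inserts one new subdivision point and compares the sum in (1.3) before and after.

```python
from fractions import Fraction

def step_sum(partition, values):
    """Right-hand side of (1.3): sum of s_k * (x_k - x_{k-1})."""
    return sum(Fraction(v) * (Fraction(hi) - Fraction(lo))
               for v, lo, hi in zip(values, partition, partition[1:]))

def insert_point(partition, values, t):
    """Refine the partition by one point t strictly inside some subinterval."""
    for k in range(1, len(partition)):
        if partition[k - 1] < t < partition[k]:
            new_partition = partition[:k] + [t] + partition[k:]
            # s keeps the same constant value s_k on both halves of the split subinterval
            new_values = values[:k] + [values[k - 1]] + values[k:]
            return new_partition, new_values
    raise ValueError("t must lie strictly between two consecutive partition points")

# Hypothetical step function: s = 4 on (0, 2) and s = -1 on (2, 3).
P, s = [0, 2, 3], [4, -1]
P_fine, s_fine = insert_point(P, s, 1)
before, after = step_sum(P, s), step_sum(P_fine, s_fine)
```

Repeating `insert_point` one point at a time reaches any finer partition, which is the step-by-step procedure described above.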

1.13 Properties of the integral of a step function 

In this section we describe a number of fundamental properties satisfied by the integral 
of a step function. Most of these properties seem obvious when they are interpreted 
geometrically, and some of them may even seem trivial. All these properties carry over 
to integrals of more general functions, and it will be a simple matter to prove them in the 
general case once we have established them for step functions. The properties are listed 
below as theorems, and in each case a geometric interpretation for nonnegative step functions 
is given in terms of areas. Analytic proofs of the theorems are outlined in Section 1.15. 



Figure 1.24 Illustrating the additive property of the integral. 


The first property states that the integral of a sum of two step functions is equal to the 
sum of the integrals. This is known as the additive property and it is illustrated in Figure 
1.24. 

THEOREM 1.2. ADDITIVE PROPERTY. 

\[ \int_a^b [s(x) + t(x)]\,dx = \int_a^b s(x)\,dx + \int_a^b t(x)\,dx . \]

The next property, illustrated in Figure 1.25, is called the homogeneous property. It 
states that if all the function values are multiplied by a constant c, then the integral is also 
multiplied by c. 

THEOREM 1.3. HOMOGENEOUS PROPERTY. For every real number c, we have 

\[ \int_a^b c \cdot s(x)\,dx = c \int_a^b s(x)\,dx . \]

These two theorems can be combined into one formula known as the linearity property. 

Figure 1.25 Illustrating the homogeneous property of the integral (with c = 2). 

THEOREM 1.4. LINEARITY PROPERTY. For every real $c_1$ and $c_2$, we have 

\[ \int_a^b [c_1 s(x) + c_2 t(x)]\,dx = c_1 \int_a^b s(x)\,dx + c_2 \int_a^b t(x)\,dx . \]
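
The linearity property can be tested numerically on two step functions whose partitions differ. The sketch below is illustrative only: the helper names and the two sample functions are assumptions, and the combination $c_1 s + c_2 t$ is formed on the common refinement of the two partitions, the same device used in the analytic proofs.

```python
from bisect import bisect_right
from fractions import Fraction

def integral(partition, values):
    """Sum of s_k * (x_k - x_{k-1}), as in formula (1.3)."""
    return sum(Fraction(v) * (hi - lo)
               for v, lo, hi in zip(values, partition, partition[1:]))

def on_refinement(partition, values, refined):
    """Values of the same step function on the subintervals of a finer partition."""
    out = []
    for lo, hi in zip(refined, refined[1:]):
        mid = Fraction(lo + hi, 2)          # an interior point of (lo, hi)
        k = bisect_right(partition, mid)    # mid lies in (x_{k-1}, x_k)
        out.append(values[k - 1])
    return out

def linear_combination(c1, s, c2, t):
    """Step function c1*s + c2*t on the common refinement of both partitions."""
    (Ps, vs), (Pt, vt) = s, t
    refined = sorted(set(Ps) | set(Pt))
    a = on_refinement(Ps, vs, refined)
    b = on_refinement(Pt, vt, refined)
    return refined, [c1 * x + c2 * y for x, y in zip(a, b)]

# Hypothetical data: s = 2 on (0, 1), 5 on (1, 3); t = -1 on (0, 2), 4 on (2, 3).
s = ([0, 1, 3], [2, 5])
t = ([0, 2, 3], [-1, 4])
c1, c2 = 3, -2
lhs = integral(*linear_combination(c1, s, c2, t))
rhs = c1 * integral(*s) + c2 * integral(*t)
```

Both sides agree for this data; a check of this kind illustrates Theorem 1.4 but does not replace its proof.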

Next, we have a comparison theorem which tells us that if one step function has larger 
values than another throughout [a, b], its integral over this interval is also larger. 

THEOREM 1.5. COMPARISON THEOREM. If s(x) < t(x) for every x in [a, b], then 

\[ \int_a^b s(x)\,dx < \int_a^b t(x)\,dx . \]

Interpreted geometrically, this theorem reflects the monotone property of area. If the 
ordinate set of a nonnegative step function lies inside another, the area of the smaller region 
is less than that of the larger. 

The foregoing properties all refer to step functions defined on a common interval. The 
integral has further important properties that relate integrals over different intervals. 
Among these we have the following. 

THEOREM 1.6. ADDITIVITY WITH RESPECT TO THE INTERVAL OF INTEGRATION. 

\[ \int_a^c s(x)\,dx + \int_c^b s(x)\,dx = \int_a^b s(x)\,dx \quad \text{if } a < c < b . \]

This theorem reflects the additive property of area, illustrated in Figure 1.26. If an ordinate 
set is decomposed into two ordinate sets, the sum of the areas of the two parts is equal to 
the area of the whole. 

The next theorem may be described as invariance under translation. If the ordinate set 
of a step function s is “shifted” by an amount c, the resulting ordinate set is that of another 
step function t related to s by the equation $t(x) = s(x - c)$. If s is defined on [a, b], then 
t is defined on [a + c, b + c], and their ordinate sets, being congruent, have equal areas. 

Figure 1.26 Additivity with respect to the interval of integration. 

Figure 1.27 Illustrating invariance of the integral under translation: $t(x) = s(x - c)$. 


This property is expressed analytically as follows: 


THEOREM 1.7. INVARIANCE UNDER TRANSLATION. 

\[ \int_a^b s(x)\,dx = \int_{a+c}^{b+c} s(x - c)\,dx \quad \text{for every real } c . \]


Its geometric meaning is illustrated in Figure 1.27 for c > 0. When c < 0, the ordinate 
set is shifted to the left. 

The homogeneous property (Theorem 1.3) explains what happens to an integral under a 
change of scale on the y-axis. The following theorem deals with a change of scale on the 
x-axis. If s is a step function defined on an interval [a, b] and if we distort the scale in the 
horizontal direction by multiplying all x-coordinates by a factor k > 0, then the new graph 
is that of another step function t defined on the interval [ka, kb] and related to s by the 
equation 

\[ t(x) = s\!\left(\frac{x}{k}\right) \quad \text{if } ka \le x \le kb . \]

Figure 1.28 Change of scale on the x-axis: $t(x) = s(x/2)$. 

An example with k = 2 is shown in Figure 1.28 and it suggests that the distorted figure has 
an area twice that of the original figure. More generally, distortion by a positive factor k 
has the effect of multiplying the integral by k. Expressed analytically, this property assumes 
the following form: 


THEOREM 1.8. EXPANSION OR CONTRACTION OF THE INTERVAL OF INTEGRATION. 

\[ \int_{ka}^{kb} s\!\left(\frac{x}{k}\right) dx = k \int_a^b s(x)\,dx \quad \text{for every } k > 0 . \]
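
Theorems 1.7 and 1.8 both act on the partition points alone: translation adds c to every $x_k$, and a change of scale multiplies every $x_k$ by k, while the constant values $s_k$ are unchanged. The sketch below illustrates this on a hypothetical step function; the helper names `translate` and `stretch` are assumptions, and only the case k > 0 is covered.

```python
from fractions import Fraction

def integral(partition, values):
    """Sum of s_k * (x_k - x_{k-1}), as in formula (1.3)."""
    return sum(Fraction(v) * (hi - lo)
               for v, lo, hi in zip(values, partition, partition[1:]))

def translate(partition, values, c):
    """Step function t(x) = s(x - c), defined on [a + c, b + c]."""
    return [x + c for x in partition], list(values)

def stretch(partition, values, k):
    """Step function t(x) = s(x / k) for k > 0, defined on [ka, kb]."""
    if k <= 0:
        raise ValueError("this sketch covers only k > 0")
    return [k * x for x in partition], list(values)

# Hypothetical step function: s = 2 on (1, 2), 1 on (2, 5), 3 on (5, 6).
P, s = [1, 2, 5, 6], [2, 1, 3]
base = integral(P, s)
shifted = integral(*translate(P, s, Fraction(7, 2)))   # expected to equal base
scaled = integral(*stretch(P, s, 3))                   # expected to equal 3 * base
```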

Until now, when we have used the symbol $\int_a^b$, it has been understood that the lower limit 
a was less than the upper limit b. It is convenient to extend our ideas somewhat and consider 
integrals with a lower limit larger than the upper limit. This is done by defining 

\[ \int_b^a s(x)\,dx = -\int_a^b s(x)\,dx \quad \text{if } a < b . \tag{1.4} \]

We also define 

\[ \int_a^a s(x)\,dx = 0 , \]

a definition that is suggested by putting a = b in (1.4). These conventions allow us to con- 
clude that Theorem 1.6 is valid not only when c is between a and b but for any arrangement 
of the points a, b, c. Theorem 1.6 is sometimes written in the form 

\[ \int_a^b s(x)\,dx + \int_b^c s(x)\,dx + \int_c^a s(x)\,dx = 0 . \]

Similarly, we can extend the range of validity of Theorem 1.8 and allow the constant k to 
be negative. In particular, when k = -1, Theorem 1.8 and Equation (1.4) give us 

\[ \int_a^b s(x)\,dx = \int_{-b}^{-a} s(-x)\,dx . \]



Figure 1.29 Illustrating the reflection property of the integral. 


We shall refer to this as the reflection property of the integral, since the graph of the function 
t given by $t(x) = s(-x)$ is obtained from that of s by reflection through the y-axis. An 
example is shown in Figure 1.29. 

1.14 Other notations for integrals 

The letter x that appears in the symbol $\int_a^b s(x)\,dx$ plays no essential role in the definition 
of the integral. Any other letter would serve equally well. The letters t, u, v, z are frequently 
used for this purpose, and it is agreed that instead of $\int_a^b s(x)\,dx$ we may write $\int_a^b s(t)\,dt$, 
$\int_a^b s(u)\,du$, etc., all these being considered as alternative notations for the same thing. The 
symbols x, t, u, etc. that are used in this way are called “dummy variables.” They are 
analogous to dummy indices used in the summation notation. 

There is a tendency among some authors of calculus textbooks to omit the dummy 
variable and the d-symbol altogether and to write simply $\int_a^b s$ for the integral. One good 
reason for using this abbreviated symbol is that it suggests more strongly that the integral 
depends only on the function s and on the interval [a, b]. Also, certain formulas appear 
simpler in this notation. For example, the additive property becomes 
$\int_a^b (s + t) = \int_a^b s + \int_a^b t$. On the other hand, it becomes awkward to write formulas like 
Theorems 1.7 and 1.8 in the abbreviated notation. More important than this, we shall find later that the 
original Leibniz notation has certain practical advantages. The symbol dx, which appears 
to be rather superfluous at this stage, turns out to be an extremely useful computational 
device in connection with many routine calculations with integrals. 


1.15 Exercises 

1. Compute the value of each of the following integrals. You may use the theorems of Section 
1.13 whenever it is convenient to do so. The notation [x] denotes the greatest integer $\le x$. 

(a) $\int_{-1}^{3} [x]\,dx$. 
(b) $\int_{-1}^{3} [x + \tfrac{1}{2}]\,dx$. 
(c) $\int_{-1}^{3} ([x] + [x + \tfrac{1}{2}])\,dx$. 
(d) $\int_{-1}^{3} 2[x]\,dx$. 
(e) $\int_{-1}^{3} [2x]\,dx$. 
(f) $\int_{-1}^{3} [-x]\,dx$. 

2. Give an example of a step function s, defined on the closed interval [0, 5], which has the 
following properties: $\int_0^2 s(x)\,dx = 5$, $\int_0^5 s(x)\,dx = 2$. 

3. Show that $\int_a^b [x]\,dx + \int_a^b [-x]\,dx = a - b$. 

4. (a) If n is a positive integer, prove that $\int_0^n [t]\,dt = n(n - 1)/2$. 

(b) If $f(x) = \int_0^x [t]\,dt$ for $x \ge 0$, draw the graph of f over the interval [0, 4]. 

5. (a) Prove that $\int_0^2 [t^2]\,dt = 5 - \sqrt{2} - \sqrt{3}$. 

(b) Compute $\int_{-3}^{3} [t^2]\,dt$. 

6. (a) If n is a positive integer, prove that $\int_0^n [t]^2\,dt = n(n - 1)(2n - 1)/6$. 

(b) If $f(x) = \int_0^x [t]^2\,dt$ for $x \ge 0$, draw the graph of f over the interval [0, 3]. 

(c) Find all x > 0 for which $\int_0^x [t]^2\,dt = 2(x - 1)$. 

7. (a) Compute $\int_0^9 [\sqrt{t}]\,dt$. 

(b) If n is a positive integer, prove that $\int_0^{n^2} [\sqrt{t}]\,dt = n(n - 1)(4n + 1)/6$. 

8. Show that the translation property (Theorem 1.7) may be expressed in the equivalent form 

\[ \int_{a+c}^{b+c} f(x)\,dx = \int_a^b f(x + c)\,dx . \]

9. Show that the following property is equivalent to Theorem 1.8: 

\[ \int_{ka}^{kb} f(x)\,dx = k \int_a^b f(kx)\,dx . \]

10. Given a positive integer p. A step function s is defined on the interval [0, p] as follows: 
$s(x) = (-1)^n n$ if x lies in the interval $n \le x < n + 1$, where n = 0, 1, 2, ..., p - 1; s(p) = 0. 
Let $f(p) = \int_0^p s(x)\,dx$. 

(a) Calculate f(3), f(4), and f(f(3)). 

(b) For what value (or values) of p is |f(p)| = 7? 

11. If, instead of defining integrals of step functions by using formula (1.3), we used the definition 

\[ \int_a^b s(x)\,dx = \sum_{k=1}^{n} s_k^3 \cdot (x_k - x_{k-1}) , \]

a new and different theory of integration would result. Which of the following properties would 
remain valid in this new theory? 

(a) $\int_a^b s + \int_b^c s = \int_a^c s$. 

(b) $\int_a^b (s + t) = \int_a^b s + \int_a^b t$. 

(c) $\int_a^b c \cdot s = c \int_a^b s$. 

(d) $\int_{a+c}^{b+c} s(x)\,dx = \int_a^b s(x + c)\,dx$. 

(e) If s(x) < t(x) for each x in [a, b], then $\int_a^b s < \int_a^b t$. 

12. Solve Exercise 11 if we use the definition 

\[ \int_a^b s(x)\,dx = \sum_{k=1}^{n} s_k \cdot (x_k^2 - x_{k-1}^2) . \]


Analytic proofs of the properties of the integral given in Section 1.13 are requested in the 
following exercises. The proofs of Theorems 1.3 and 1.8 are worked out here as samples. 
Hints are given for the others. 

Proof of Theorem 1.3: $\int_a^b c \cdot s(x)\,dx = c \int_a^b s(x)\,dx$ for every real c. 

Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of [a, b] such that s is constant on the open subintervals 
of P. Assume $s(x) = s_k$ if $x_{k-1} < x < x_k$ (k = 1, 2, ..., n). Then $c \cdot s(x) = c \cdot s_k$ if 
$x_{k-1} < x < x_k$, and hence by the definition of an integral we have 

\[ \int_a^b c \cdot s(x)\,dx = \sum_{k=1}^{n} c \cdot s_k \cdot (x_k - x_{k-1}) = c \sum_{k=1}^{n} s_k \cdot (x_k - x_{k-1}) = c \int_a^b s(x)\,dx . \]

Proof of Theorem 1.8: $\int_{ka}^{kb} s(x/k)\,dx = k \int_a^b s(x)\,dx$ if k > 0. 

Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of the interval [a, b] such that s is constant on the 
open subintervals of P. Assume that $s(x) = s_i$ if $x_{i-1} < x < x_i$. Let $t(x) = s(x/k)$ if 
$ka \le x \le kb$. Then $t(x) = s_i$ if x lies in the open interval $(kx_{i-1}, kx_i)$; hence 
$P' = \{kx_0, kx_1, \ldots, kx_n\}$ is a partition of [ka, kb] and t is constant on the open subintervals of P'. 
Therefore t is a step function whose integral is 

\[ \int_{ka}^{kb} t(x)\,dx = \sum_{i=1}^{n} s_i \cdot (kx_i - kx_{i-1}) = k \sum_{i=1}^{n} s_i \cdot (x_i - x_{i-1}) = k \int_a^b s(x)\,dx . \]

13. Prove Theorem 1.2 (the additive property). 

[Hint: Use the additive property for sums: $\sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k$.] 

14. Prove Theorem 1.4 (the linearity property). 

[Hint: Use the additive property and the homogeneous property.] 

15. Prove Theorem 1.5 (the comparison theorem). 

[Hint: Use the corresponding property for sums: $\sum_{k=1}^{n} a_k < \sum_{k=1}^{n} b_k$ if $a_k < b_k$ for 
k = 1, 2, ..., n.] 

16. Prove Theorem 1.6 (additivity with respect to the interval). 

[Hint: If $P_1$ is a partition of [a, c] and $P_2$ a partition of [c, b], then the points of $P_1$ along 
with those of $P_2$ form a partition of [a, b].] 

17. Prove Theorem 1.7 (invariance under translation). 

[Hint: If $P = \{x_0, x_1, \ldots, x_n\}$ is a partition of [a, b], then $P' = \{x_0 + c, x_1 + c, \ldots, x_n + c\}$ 
is a partition of [a + c, b + c].] 

1.16 The integral of more general functions 

The integral $\int_a^b s(x)\,dx$ has been defined when s is a step function. In this section we shall 
formulate a definition of $\int_a^b f(x)\,dx$ that will apply to more general functions f. The 
definition will be constructed so that the resulting integral has all the properties listed in 
Section 1.13. 




Figure 1.30 Approximating a function f from above and below by step functions. 

The approach will be patterned somewhat after the method of Archimedes, which was 
explained above in Section I 1.3. The idea is simply this: We begin by approximating the 
function f from below and from above by step functions, as suggested in Figure 1.30. 
That is, we choose an arbitrary step function, say s, whose graph lies below that of f, and an 
arbitrary step function, say t, whose graph lies above that of f. Next, we consider the 
collection of all the numbers $\int_a^b s(x)\,dx$ and $\int_a^b t(x)\,dx$ obtained by choosing s and t in all 
possible ways. In general, we have 

\[ \int_a^b s(x)\,dx < \int_a^b t(x)\,dx \]

because of the comparison theorem. If the integral of f is to obey the comparison theorem, 
then it must be a number which falls between $\int_a^b s(x)\,dx$ and $\int_a^b t(x)\,dx$ for every pair of 
approximating functions s and t. If there is only one number which has this property, 
we define the integral of f to be this number. 

There is only one thing that can cause trouble in this procedure, and it occurs in the very 
first step. Unfortunately, it is not possible to approximate every function from above 
and from below by step functions. For example, the function f given by the equations 

\[ f(x) = \frac{1}{x} \quad \text{if } x \ne 0 , \qquad f(0) = 0 , \]

is defined for all real x, but on any interval [a, b] containing the origin we cannot surround 
f by step functions. This is due to the fact that f has arbitrarily large values near the origin 
or, as we say, f is unbounded in every neighborhood of the origin (see Figure 1.31). Therefore, 
we shall first restrict ourselves to those functions that are bounded on [a, b], that is, to 
those functions f for which there exists a number M > 0 such that 

\[ -M \le f(x) \le M \tag{1.5} \]

for every x in [a, b]. Geometrically, the graph of such a function lies between the graphs 
of two constant step functions s and t having the values -M and +M, respectively. (See 
Figure 1.32.) In a case like this, we say that f is bounded by M. The two inequalities in 
(1.5) can also be written as 

\[ |f(x)| \le M . \]

Figure 1.31 An unbounded function. 

Figure 1.32 A bounded function. 

With this point taken care of, we can proceed to carry out the plan described above and 
to formulate the definition of the integral. 


DEFINITION OF THE INTEGRAL OF A BOUNDED FUNCTION. Let f be a function defined and 
bounded on [a, b]. Let s and t denote arbitrary step functions defined on [a, b] such that 

\[ s(x) \le f(x) \le t(x) \tag{1.6} \]

for every x in [a, b]. If there is one and only one number I such that 

\[ \int_a^b s(x)\,dx \le I \le \int_a^b t(x)\,dx \tag{1.7} \]

for every pair of step functions s and t satisfying (1.6), then this number I is called the 
integral of f from a to b, and is denoted by the symbol $\int_a^b f(x)\,dx$ or by $\int_a^b f$. When such 
an I exists, the function f is said to be integrable on [a, b]. 

If a < b, we define $\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$, provided f is integrable on [a, b]. We 
also define $\int_a^a f(x)\,dx = 0$. If f is integrable on [a, b], we say that the integral $\int_a^b f(x)\,dx$ 
exists. The function f is called the integrand, the numbers a and b are called the limits of 
integration, and the interval [a, b] the interval of integration. 

1.17 Upper and lower integrals 

Assume f is bounded on [a, b]. If s and t are step functions satisfying (1.6), we say s is 
below f and t is above f, and we write $s \le f \le t$. 

Let S denote the set of all numbers $\int_a^b s(x)\,dx$ obtained as s runs through all step functions 
below f, and let T be the set of all numbers $\int_a^b t(x)\,dx$ obtained as t runs through all step 
functions above f. That is, let 

\[ S = \left\{ \int_a^b s(x)\,dx \;\middle|\; s \le f \right\} , \qquad T = \left\{ \int_a^b t(x)\,dx \;\middle|\; f \le t \right\} . \]

Both sets S and T are nonempty since f is bounded. Also, $\int_a^b s(x)\,dx \le \int_a^b t(x)\,dx$ if $s \le f \le t$, 
so no number in S exceeds any number in T. Therefore, by Theorem I.34, S has 
a supremum, and T has an infimum, and they satisfy the inequalities 

\[ \int_a^b s(x)\,dx \le \sup S \le \inf T \le \int_a^b t(x)\,dx \]

for all s and t satisfying $s \le f \le t$. This shows that both numbers sup S and inf T satisfy 
(1.7). Therefore, f is integrable on [a, b] if and only if sup S = inf T, in which case we have 

\[ \int_a^b f(x)\,dx = \sup S = \inf T . \]

The number sup S is called the lower integral of f and is denoted by $\underline{I}(f)$. The number 
inf T is called the upper integral of f and is denoted by $\bar{I}(f)$. Thus, we have 

\[ \underline{I}(f) = \sup \left\{ \int_a^b s(x)\,dx \;\middle|\; s \le f \right\} , \qquad \bar{I}(f) = \inf \left\{ \int_a^b t(x)\,dx \;\middle|\; f \le t \right\} . \]

The foregoing argument proves the following theorem. 

THEOREM 1.9. Every function f which is bounded on [a, b] has a lower integral $\underline{I}(f)$ and 
an upper integral $\bar{I}(f)$ satisfying the inequalities 

\[ \int_a^b s(x)\,dx \le \underline{I}(f) \le \bar{I}(f) \le \int_a^b t(x)\,dx \]

for all step functions s and t with $s \le f \le t$. The function f is integrable on [a, b] if and only 
if its upper and lower integrals are equal, in which case we have 

\[ \int_a^b f(x)\,dx = \underline{I}(f) = \bar{I}(f) . \]

1.18 The area of an ordinate set expressed as an integral 

The concept of area was introduced axiomatically in Section 1.6 as a set function having 
certain properties. From these properties we proved that the area of the ordinate set of a 
nonnegative step function is equal to the integral of the function. Now we show that the 
same is true for any integrable nonnegative function. We recall that the ordinate set of a 
nonnegative function f over an interval [a, b] is the set of all points (x, y) satisfying the 
inequalities $0 \le y \le f(x)$, $a \le x \le b$. 

THEOREM 1.10. Let f be a nonnegative function, integrable on an interval [a, b], and let 
Q denote the ordinate set of f over [a, b]. Then Q is measurable and its area is equal to the 
integral $\int_a^b f(x)\,dx$. 

Proof. Let S and T be two step regions satisfying $S \subseteq Q \subseteq T$. Then there are two step 
functions s and t satisfying $s \le f \le t$ on [a, b], such that 

\[ a(S) = \int_a^b s(x)\,dx \quad \text{and} \quad a(T) = \int_a^b t(x)\,dx . \]

Since f is integrable on [a, b], the number $I = \int_a^b f(x)\,dx$ is the only number satisfying the 
inequalities 

\[ \int_a^b s(x)\,dx \le I \le \int_a^b t(x)\,dx \]

for all step functions s and t with $s \le f \le t$. Therefore this is also the only number satisfying 
$a(S) \le I \le a(T)$ for all step regions S and T with $S \subseteq Q \subseteq T$. By the exhaustion property, 
this proves that Q is measurable and that a(Q) = I. 

Let Q denote the ordinate set of Theorem 1.10, and let Q' denote the set that remains if 
we remove from Q those points on the graph of f. That is, let 

\[ Q' = \{ (x, y) \mid a \le x \le b , \; 0 \le y < f(x) \} . \]

The argument used to prove Theorem 1.10 also shows that Q' is measurable and that 
a(Q') = a(Q). Therefore, by the difference property of area, the set Q - Q' is measurable 
and 

\[ a(Q - Q') = a(Q) - a(Q') = 0 . \]

In other words, we have proved the following theorem. 

THEOREM 1.11. Let f be a nonnegative function, integrable on an interval [a, b]. Then 
the graph of f, that is, the set 

\[ \{ (x, y) \mid a \le x \le b , \; y = f(x) \} , \]

is measurable and has area equal to 0. 

1.19 Informal remarks on the theory and technique of integration 

Two fundamental questions arise at this stage: (1) Which bounded functions are integrable? 
(2) Given that a function f is integrable, how do we compute the integral of f? 

The first question comes under the heading “Theory of Integration” and the second under 
the heading “Technique of Integration.” A complete answer to question (1) lies beyond the 
scope of an introductory course and will not be given in this book. Instead, we shall give 
partial answers which require only elementary ideas. 

First we introduce an important class of functions known as monotonic functions. In 
the following section we define these functions and give a number of examples. Then we 
prove that all bounded monotonic functions are integrable. Fortunately, most of the 
functions that occur in practice are monotonic or sums of monotonic functions, so the 
results of this miniature theory of integration are quite comprehensive. 

The discussion of “Technique of Integration” begins in Section 1.23, where we calculate 
the integral $\int_0^b x^p\,dx$, when p is a positive integer. Then we develop general properties of the 
integral, such as linearity and additivity, and show how these properties help us to extend 
our knowledge of integrals of specific functions. 


1.20 Monotonic and piecewise monotonic functions. Definitions and examples 


A function f is said to be increasing on a set S if $f(x) \le f(y)$ for every pair of points x 
and y in S with x < y. If the strict inequality f(x) < f(y) holds for all x < y in S, the 
function is said to be strictly increasing on S. Similarly, f is called decreasing on S if 
$f(x) \ge f(y)$ for all x < y in S. If f(x) > f(y) for all x < y in S, then f is called strictly 
decreasing on S. A function is called monotonic on S if it is increasing on S or if it is 
decreasing on S. The term strictly monotonic means that f is strictly increasing on S or strictly 
decreasing on S. Ordinarily, the set S under consideration is either an open interval or a 
closed interval. Examples are shown in Figure 1.33. 

Figure 1.33 Monotonic functions (panels labeled “Strictly increasing” and “Strictly decreasing”). 



Figure 1.34 A piecewise monotonic function. 

A function f is said to be piecewise monotonic on an interval if its graph consists of a 
finite number of monotonic pieces. That is to say, f is piecewise monotonic on [a, b] if 
there is a partition P of [a, b] such that f is monotonic on each of the open subintervals of 
P. In particular, step functions are piecewise monotonic, as are all the examples shown in 
Figures 1.33 and 1.34. 


EXAMPLE 1. The power functions. If p is a positive integer, we have the inequality 

\[ x^p < y^p \quad \text{if } 0 \le x < y , \]

which is easily proved by mathematical induction. This shows that the power function f, 
defined for all real x by the equation $f(x) = x^p$, is strictly increasing on the nonnegative 
real axis. It is also strictly monotonic on the negative real axis (it is decreasing if p is even 
and increasing if p is odd). Therefore, f is piecewise monotonic on every finite interval. 

EXAMPLE 2. The square-root function. Let $f(x) = \sqrt{x}$ for $x \ge 0$. This function is strictly 
increasing on the nonnegative real axis. In fact, if $0 \le x < y$, we have 

\[ \sqrt{y} - \sqrt{x} = \frac{y - x}{\sqrt{y} + \sqrt{x}} ; \]

hence, $\sqrt{y} - \sqrt{x} > 0$. 

EXAMPLE 3. The graph of the function g defined by the equation 

\[ g(x) = \sqrt{r^2 - x^2} \quad \text{if } -r \le x \le r \]

is a semicircle of radius r. This function is strictly increasing on the interval $-r \le x \le 0$ 
and strictly decreasing on the interval $0 \le x \le r$. Hence, g is piecewise monotonic on 
[-r, r]. 


1.21 Integrability of bounded monotonic functions 

The importance of monotonic functions in integration theory is due to the following 
theorem. 


THEOREM 1.12. If f is monotonic on a closed interval [a, b], then f is integrable on [a, b]. 

Proof. We shall prove the theorem for increasing functions. The proof for decreasing 
functions is analogous. Assume f is increasing and let $\underline{I}(f)$ and $\bar{I}(f)$ denote its lower and 
upper integrals, respectively. We shall prove that $\underline{I}(f) = \bar{I}(f)$. 

Let n be a positive integer and construct two special approximating step functions $s_n$ and 
$t_n$ as follows: Let $P = \{x_0, x_1, \ldots, x_n\}$ be a partition of [a, b] into n equal subintervals, that 
is, subintervals $[x_{k-1}, x_k]$ with $x_k - x_{k-1} = (b - a)/n$ for each k. Now define $s_n$ and $t_n$ by 
the formulas 

\[ s_n(x) = f(x_{k-1}) , \qquad t_n(x) = f(x_k) \qquad \text{if } x_{k-1} < x < x_k . \]

At the subdivision points, define $s_n$ and $t_n$ so as to preserve the relations 
$s_n(x) \le f(x) \le t_n(x)$ throughout [a, b]. An example is shown in Figure 1.35(a). For this choice of step 
functions, we have 

\[ \int_a^b t_n - \int_a^b s_n = \sum_{k=1}^{n} f(x_k)(x_k - x_{k-1}) - \sum_{k=1}^{n} f(x_{k-1})(x_k - x_{k-1}) \]
\[ = \frac{b - a}{n} \sum_{k=1}^{n} [f(x_k) - f(x_{k-1})] = \frac{(b - a)[f(b) - f(a)]}{n} , \]

where the last equation is a consequence of the telescoping property of finite sums. This last 
relation has a simple geometric interpretation. The difference $\int_a^b t_n - \int_a^b s_n$ is equal to the 
sum of the areas of the shaded rectangles in Figure 1.35(a). By sliding these rectangles to 
the right so that they rest on a common base as in Figure 1.35(b), we see that they fill out a 
rectangle of base $(b - a)/n$ and altitude $f(b) - f(a)$; the sum of the areas is therefore 
C/n, where $C = (b - a)[f(b) - f(a)]$. 

Figure 1.35 Proof of integrability of an increasing function. 

Now we rewrite the foregoing relation in the form 

\[ \int_a^b t_n - \int_a^b s_n = \frac{C}{n} . \tag{1.8} \]

The lower and upper integrals of f satisfy the inequalities 

\[ \int_a^b s_n \le \underline{I}(f) \le \int_a^b t_n \quad \text{and} \quad \int_a^b s_n \le \bar{I}(f) \le \int_a^b t_n . \]

Multiplying the first set of inequalities by (-1) and adding the result to the second set, we 
obtain 

\[ \bar{I}(f) - \underline{I}(f) \le \int_a^b t_n - \int_a^b s_n . \]

Using (1.8) and the relation $\underline{I}(f) \le \bar{I}(f)$, we obtain 

\[ 0 \le \bar{I}(f) - \underline{I}(f) \le \frac{C}{n} \]

for every integer $n \ge 1$. Therefore, by Theorem I.31, we must have $\underline{I}(f) = \bar{I}(f)$. This 
proves that f is integrable on [a, b]. 
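
The key identity of the proof, that the integrals of $t_n$ and $s_n$ differ by exactly $(b - a)[f(b) - f(a)]/n$, can be confirmed in exact arithmetic. The sketch below is illustrative: the function name `lower_upper_sums` and the test function $x^3$ on [0, 2] are assumptions, not taken from the text.

```python
from fractions import Fraction

def lower_upper_sums(f, a, b, n):
    """Integrals of s_n and t_n for an increasing f on [a, b], n equal subintervals."""
    a, b = Fraction(a), Fraction(b)
    h = (b - a) / n
    x = [a + k * h for k in range(n + 1)]
    lower = h * sum(f(x[k - 1]) for k in range(1, n + 1))   # s_n takes the value f(x_{k-1})
    upper = h * sum(f(x[k]) for k in range(1, n + 1))       # t_n takes the value f(x_k)
    return lower, upper

def cube(x):
    """Hypothetical test function, increasing on [0, 2]."""
    return x ** 3

a, b = 0, 2
C = (b - a) * (cube(b) - cube(a))        # the constant C of the proof
gaps = {n: lower_upper_sums(cube, a, b, n) for n in (1, 4, 10, 100)}
```

For each n the difference `upper - lower` equals `C / n`, so the gap between the two approximating integrals shrinks like 1/n, which is what forces the lower and upper integrals to coincide.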


1.22 Calculation of the integral of a bounded monotonic function 

The proof of Theorem 1.12 not only shows that the integral of a bounded increasing 
function exists, but it also suggests a method for computing the value of the integral. This 
is described by the following theorem. 


THEOREM 1.13. Assume f is increasing on a closed interval [a, b]. Let $x_k = a + k(b - a)/n$ 
for k = 0, 1, ..., n. If I is any number which satisfies the inequalities 

\[ \frac{b - a}{n} \sum_{k=0}^{n-1} f(x_k) \le I \le \frac{b - a}{n} \sum_{k=1}^{n} f(x_k) \tag{1.9} \]

for every integer $n \ge 1$, then $I = \int_a^b f(x)\,dx$. 

Proof. Let $s_n$ and $t_n$ be the special approximating step functions obtained by subdivision 
of the interval [a, b] into n equal parts, as described in the proof of Theorem 1.12. Then, 
inequalities (1.9) state that 

\[ \int_a^b s_n \le I \le \int_a^b t_n \]

for every $n \ge 1$. But the integral $\int_a^b f(x)\,dx$ satisfies the same inequalities as I. Using 
Equation (1.8) we see that 

\[ 0 \le \left| I - \int_a^b f(x)\,dx \right| \le \frac{C}{n} \]

for every integer $n \ge 1$. Therefore, by Theorem I.31, we have $I = \int_a^b f(x)\,dx$, as asserted. 

An analogous argument gives a proof of the corresponding theorem for decreasing 
functions. 

THEOREM 1.14. Assume f is decreasing on [a, b]. Let $x_k = a + k(b - a)/n$ for 
k = 0, 1, ..., n. If I is any number which satisfies the inequalities 

\[ \frac{b - a}{n} \sum_{k=1}^{n} f(x_k) \le I \le \frac{b - a}{n} \sum_{k=0}^{n-1} f(x_k) \]

for every integer $n \ge 1$, then $I = \int_a^b f(x)\,dx$. 
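
As a small worked illustration of how Theorem 1.13 is applied, take the increasing function f(x) = x on [0, b], with b > 0 assumed, so that $x_k = kb/n$. Using the familiar sum $1 + 2 + \cdots + m = m(m + 1)/2$, the two sums that bound I in Theorem 1.13 can be evaluated exactly:

```latex
\begin{align*}
\frac{b}{n}\sum_{k=0}^{n-1} f(x_k)
  &= \frac{b^2}{n^2}\sum_{k=0}^{n-1} k
   = \frac{b^2}{2}\Bigl(1 - \frac{1}{n}\Bigr), \\
\frac{b}{n}\sum_{k=1}^{n} f(x_k)
  &= \frac{b^2}{n^2}\sum_{k=1}^{n} k
   = \frac{b^2}{2}\Bigl(1 + \frac{1}{n}\Bigr).
\end{align*}
```

The number $I = b^2/2$ lies between these two quantities for every integer $n \ge 1$, so Theorem 1.13 gives $\int_0^b x\,dx = b^2/2$.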

1.23 Calculation of the integral $\int_0^b x^p\,dx$ when p is a positive integer 

To illustrate the use of Theorem 1.13 we shall calculate the integral $\int_0^b x^p\,dx$ where 
b > 0 and p is any positive integer. The integral exists because the integrand is bounded 
and increasing on [0, b]. 

THEOREM 1.15. If p is a positive integer and b > 0, we have 

\[ \int_0^b x^p\,dx = \frac{b^{p+1}}{p + 1} . \]

Proof. We begin with the inequalities 

\[ \sum_{k=1}^{n-1} k^p < \frac{n^{p+1}}{p + 1} < \sum_{k=1}^{n} k^p , \]

valid for every integer $n \ge 1$ and every integer $p \ge 1$. These inequalities may be easily 
proved by mathematical induction. (A proof is outlined in Exercise 13 of Section I 4.10.) 
Multiplication of these inequalities by $b^{p+1}/n^{p+1}$ gives us 

\[ \frac{b}{n} \sum_{k=1}^{n-1} \left( \frac{kb}{n} \right)^p < \frac{b^{p+1}}{p + 1} < \frac{b}{n} \sum_{k=1}^{n} \left( \frac{kb}{n} \right)^p . \]

If we let $f(x) = x^p$ and $x_k = kb/n$, for k = 0, 1, 2, ..., n, these inequalities become 

\[ \frac{b}{n} \sum_{k=0}^{n-1} f(x_k) < \frac{b^{p+1}}{p + 1} < \frac{b}{n} \sum_{k=1}^{n} f(x_k) . \]

Therefore, the inequalities (1.9) of Theorem 1.13 are satisfied with $f(x) = x^p$, a = 0, and 
$I = b^{p+1}/(p + 1)$. It follows that $\int_0^b x^p\,dx = b^{p+1}/(p + 1)$. 
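
The inequalities on sums of pth powers that start the proof can be spot-checked for small n and p. The sketch below is only a finite check under assumed ranges (n up to 30, p up to 6), not a substitute for the induction proof; the function name `power_sum_bounds` is hypothetical.

```python
from fractions import Fraction

def power_sum_bounds(n, p):
    """Return (sum_{k=1}^{n-1} k^p, n^(p+1)/(p+1), sum_{k=1}^{n} k^p)."""
    lower = sum(k ** p for k in range(1, n))     # empty sum (= 0) when n = 1
    middle = Fraction(n ** (p + 1), p + 1)
    upper = lower + n ** p
    return lower, middle, upper

checks = [power_sum_bounds(n, p) for n in range(1, 31) for p in range(1, 7)]
all_hold = all(lo < mid < up for lo, mid, up in checks)
```

For instance, with n = 3 and p = 2 the three quantities are 5, 9 and 14.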


1.24 The basic properties of the integral 

From the definition of the integral, it is possible to deduce the following properties. 
Proofs are given in Section 1.27. 

THEOREM 1.16. LINEARITY WITH RESPECT TO THE INTEGRAND. If both f and g are 
integrable on [a, b], so is $c_1 f + c_2 g$ for every pair of constants $c_1$ and $c_2$. Furthermore, we have 

\[ \int_a^b [c_1 f(x) + c_2 g(x)]\,dx = c_1 \int_a^b f(x)\,dx + c_2 \int_a^b g(x)\,dx . \]

Note: By use of mathematical induction, the linearity property can be generalized as 
follows: If $f_1, \ldots, f_n$ are integrable on [a, b], then so is $c_1 f_1 + \cdots + c_n f_n$ for all real 
$c_1, \ldots, c_n$, and 

\[ \int_a^b \sum_{k=1}^{n} c_k f_k(x)\,dx = \sum_{k=1}^{n} c_k \int_a^b f_k(x)\,dx . \]

THEOREM 1.17. ADDITIVITY WITH RESPECT TO THE INTERVAL OF INTEGRATION. If two 
of the following three integrals exist, the third also exists, and we have 

\[ \int_a^b f(x)\,dx + \int_b^c f(x)\,dx = \int_a^c f(x)\,dx . \]

Note: In particular, if f is monotonic on [a, b] and also on [b, c], then both integrals 
$\int_a^b f$ and $\int_b^c f$ exist, so $\int_a^c f$ also exists and is equal to the sum of the other two integrals. 

THEOREM 1.18. INVARIANCE UNDER TRANSLATION. If f is integrable on [a, b], then for 
every real c we have 

\[ \int_a^b f(x)\,dx = \int_{a+c}^{b+c} f(x - c)\,dx . \]

THEOREM 1.19. EXPANSION OR CONTRACTION OF THE INTERVAL OF INTEGRATION. If f is 
integrable on [a, b], then for every real $k \ne 0$ we have 

\[ \int_a^b f(x)\,dx = \frac{1}{k} \int_{ka}^{kb} f\!\left( \frac{x}{k} \right) dx . \]

Note: In both Theorems 1.18 and 1.19, the existence of one of the integrals implies the 
existence of the other. When k = -1, Theorem 1.19 is called the reflection property. 

THEOREM 1.20. COMPARISON THEOREM. If both f and g are integrable on [a, b] and if 
$g(x) \le f(x)$ for every x in [a, b], then we have 

\[ \int_a^b g(x)\,dx \le \int_a^b f(x)\,dx . \]

An important special case of Theorem 1.20 occurs when g(x) = 0 for every x. In this 
case, the theorem states that if $f(x) \ge 0$ everywhere on [a, b], then $\int_a^b f(x)\,dx \ge 0$. In 
other words, a nonnegative function has a nonnegative integral. It can also be shown 
that if we have the strict inequality g(x) < f(x) for all x in [a, b], then the same strict 
inequality holds for the integrals, but the proof is not easy to give at this stage. 

In Chapter 5 we shall discuss various methods for calculating the value of an integral 
without the necessity of using the definition in each case. These methods, however, are 
applicable to only a relatively small number of functions, and for most integrable functions 
the actual numerical value of the integral can only be estimated. This is usually done by 
approximating the integrand above and below by step functions or by other simple functions 
whose integrals can be evaluated exactly. Then the comparison theorem is used to obtain 
corresponding approximations for the integral of the function in question. This idea will 
be explored more fully in Chapter 7. 


1.25 Integration of polynomials 

In Section 1.23 we established the integration formula

$$\int_0^b x^p\,dx = \frac{b^{p+1}}{p+1} \tag{1.10}$$

for $b > 0$ and $p$ any positive integer. The formula is also valid if $b = 0$, since both members are zero. We can use Theorem 1.19 to show that (1.10) also holds for negative $b$. We simply take $k = -1$ in Theorem 1.19 to obtain

$$\int_0^{-b} x^p\,dx = -\int_0^b (-x)^p\,dx = (-1)^{p+1} \int_0^b x^p\,dx = \frac{(-b)^{p+1}}{p+1},$$

which shows that (1.10) holds for negative $b$. The additive property $\int_a^b x^p\,dx = \int_0^b x^p\,dx - \int_0^a x^p\,dx$ now leads to the more general formula

$$\int_a^b x^p\,dx = \frac{b^{p+1} - a^{p+1}}{p+1},$$

valid for all real $a$ and $b$, and any integer $p \ge 0$.
Sometimes the special symbol

$$P(x)\Big|_a^b$$

is used to designate the difference $P(b) - P(a)$. Thus the foregoing formula may also be written as follows:

$$\int_a^b x^p\,dx = \frac{x^{p+1}}{p+1}\Big|_a^b = \frac{b^{p+1} - a^{p+1}}{p+1}.$$

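This formula, along with the linearity property, enables us to integrate every polynomial. For example, to compute the integral $\int_1^3 (x^2 - 3x + 5)\,dx$, we find the integral of each term and then add the results. Thus, we have

$$\int_1^3 (x^2 - 3x + 5)\,dx = \int_1^3 x^2\,dx - 3\int_1^3 x\,dx + 5\int_1^3 dx = \frac{x^3}{3}\Big|_1^3 - 3\,\frac{x^2}{2}\Big|_1^3 + 5x\Big|_1^3$$

$$= \frac{3^3 - 1^3}{3} - 3\,\frac{3^2 - 1^2}{2} + 5\,\frac{3^1 - 1^1}{1} = \frac{26}{3} - 12 + 10 = \frac{20}{3}.$$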

More generally, to compute the integral of any polynomial we integrate term by term:

$$\int_a^b \sum_{k=0}^n c_k x^k\,dx = \sum_{k=0}^n c_k \int_a^b x^k\,dx = \sum_{k=0}^n c_k\,\frac{b^{k+1} - a^{k+1}}{k+1}.$$
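The term-by-term formula translates directly into a short routine. This is a minimal sketch with exact rational arithmetic; the function name `integrate_polynomial` and the coefficient-list convention (`coeffs[k]` multiplies $x^k$) are assumptions made for illustration.

```python
from fractions import Fraction

def integrate_polynomial(coeffs, a, b):
    """Integrate sum(coeffs[k] * x**k) over [a, b] term by term,
    using (b**(k+1) - a**(k+1)) / (k + 1) for each power x**k."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c) * (b ** (k + 1) - a ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))

# The polynomial x**2 - 3x + 5 on [1, 3], written as coefficients of 1, x, x**2.
result = integrate_polynomial([5, -3, 1], 1, 3)
```

Using `Fraction` keeps every intermediate value exact, so the result can be compared directly with a hand computation.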


We can also integrate more complicated functions formed by piecing together various polynomials. For example, consider the integral $\int_0^1 |x(2x - 1)|\,dx$. Because of the absolute-value signs, the integrand is not a polynomial. However, by considering the sign of $x(2x - 1)$, we can split the interval $[0, 1]$ into two subintervals, in each of which the integrand is a polynomial. As $x$ varies from 0 to 1, the product $x(2x - 1)$ changes sign at the point $x = \frac{1}{2}$; it is negative if $0 < x < \frac{1}{2}$ and positive if $\frac{1}{2} < x < 1$. Therefore, we use the additive property to write

$$\int_0^1 |x(2x - 1)|\,dx = -\int_0^{1/2} x(2x - 1)\,dx + \int_{1/2}^1 x(2x - 1)\,dx$$

$$= \int_0^{1/2} (x - 2x^2)\,dx + \int_{1/2}^1 (2x^2 - x)\,dx = \left(\tfrac{1}{8} - \tfrac{1}{12}\right) + \left(\tfrac{7}{12} - \tfrac{3}{8}\right) = \tfrac{1}{4}.$$
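The splitting argument can be replayed with exact arithmetic. In this sketch the helper `P` is a hypothetical name for the polynomial $\frac{2}{3}x^3 - \frac{1}{2}x^2$, whose differences $P(b) - P(a)$ give the integral of $2x^2 - x = x(2x - 1)$ over $[a, b]$ by the polynomial formula.

```python
from fractions import Fraction

def P(x):
    """Value at x of (2/3)x**3 - (1/2)x**2; P(b) - P(a) is the
    integral of x(2x - 1) over [a, b] by the polynomial formula."""
    x = Fraction(x)
    return Fraction(2, 3) * x ** 3 - Fraction(1, 2) * x ** 2

half = Fraction(1, 2)
negative_part = P(half) - P(0)   # integral of x(2x - 1) over [0, 1/2]
positive_part = P(1) - P(half)   # integral of x(2x - 1) over [1/2, 1]
# On [0, 1/2] the integrand is negative, so its absolute value flips the sign.
total = -negative_part + positive_part
```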


1.26 Exercises 

Compute each of the following integrals. 


lJ S 

2 J 

3. 


x 2 dx' 

3 x 2 dx. 

-3 


1/2 


12 


4x 3 dx. 


■ ! i 

K 


4x 3 dx. 


5/ 4 dt. 

6 . 1 J 


5t 4 dt. 

7. J 1 (5x 4 — 4x 3 ) dx. 
c® 

8. J ^ (5x 4 _ 4 X 3) dx. 


11.1 

. 1 °: 

13.1 ° 

i4 J; 

15. i 2 (x - l)(3x - 1) dx. 

,0 

2 \{x - l)(3x - 1)| dx. 

i 

(2x - 5) 3 dx. 


( 8,3 + 6,2 - it + 5) dt. 
(u - 1 )(« - 2) du. 

(x + l) 2 dx. 
\x+\fdx. 


16. 

17 


J: 

J 


-l 


(t 2 +l)dt. 


19 

20 . 


18. J (x 2 - 3) 3 dx. 


. 1 5 x 2 (x - 5) 4 dx. 
o 

J 4 (x+4) 10 dx. [///«/: Theorem 1 .18,] 


10. J' (3x 2 - 4x + 2) dx. 

21. Find all values of c for which

(a) Jj x(l - x) dx = 0, (b) jj |x(l - x)\ dx = 0. 

22. Compute each of the following integrals. Draw the graph off in each case. 

if 0 < X <, 1, 
if 1 x < 2. 
if 0 < x < c, 


(a) jJ/M dx where f(x) = 

(b) j" 1 f(x) dx where f(x) = 


2 — x 
x 

1 - x 

c , if c < x < 1; 

1 - C — — ! 


c is a fixed real number, 0 < c < 1 . 

23. Find a quadratic polynomial P for which P(0) = P( 1) = 0 and jj P(x) dx = 1. 

24. Find a cubic polynomial P for which P(0) = P( -2) = 0, P(l) = 15, and 3 P(x) dx = 4 
Optional exercises

25. Let $f$ be a function whose domain contains $-x$ whenever it contains $x$. We say that $f$ is an even function if $f(-x) = f(x)$ and an odd function if $f(-x) = -f(x)$ for all $x$ in the domain of $f$. If $f$ is integrable on $[0, b]$, prove that

(a) $\int_{-b}^{b} f(x)\,dx = 2\int_0^b f(x)\,dx$ if $f$ is even;

(b) $\int_{-b}^{b} f(x)\,dx = 0$ if $f$ is odd.

26. Use Theorems 1.18 and 1.19 to derive the formula

$$\int_a^b f(x)\,dx = (b - a) \int_0^1 f[a + (b - a)x]\,dx .$$

27. Theorems 1.18 and 1.19 suggest a common generalization for the integral $\int_a^b f(Ax + B)\,dx$. Guess the formula suggested and prove it with the help of Theorems 1.18 and 1.19. Discuss also the case $A = 0$.

28. Use Theorems 1.18 and 1.19 to derive the formula

$$\int_a^b f(c - x)\,dx = \int_{c-b}^{c-a} f(x)\,dx .$$


1.27 Proofs of the basic properties of the integral 

This section contains proofs of the basic properties of the integral listed in Theorems 1.16 through 1.20 in Section 1.24. We make repeated use of the fact that every function $f$ which is bounded on an interval $[a, b]$ has a lower integral $\underline{I}(f)$ and an upper integral $\bar{I}(f)$ given by

$$\underline{I}(f) = \sup\left\{\int_a^b s \;\Big|\; s \le f\right\}, \qquad \bar{I}(f) = \inf\left\{\int_a^b t \;\Big|\; f \le t\right\},$$

where $s$ and $t$ denote arbitrary step functions below and above $f$, respectively. We know, by Theorem 1.9, that $f$ is integrable if and only if $\underline{I}(f) = \bar{I}(f)$, in which case the value of the integral of $f$ is the common value of the upper and lower integrals.

Proof of the Linearity Property (Theorem 1.16). We decompose the linearity property into two parts:

$$\int_a^b (f + g) = \int_a^b f + \int_a^b g, \tag{A}$$

$$\int_a^b cf = c \int_a^b f. \tag{B}$$

To prove (A), let $I(f) = \int_a^b f$ and let $I(g) = \int_a^b g$. We shall prove that $\underline{I}(f + g) = \bar{I}(f + g) = I(f) + I(g)$.

Let $s_1$ and $s_2$ denote arbitrary step functions below $f$ and $g$, respectively. Since $f$ and $g$ are integrable, we have

$$I(f) = \sup\left\{\int_a^b s_1 \;\Big|\; s_1 \le f\right\}, \qquad I(g) = \sup\left\{\int_a^b s_2 \;\Big|\; s_2 \le g\right\}.$$

By the additive property of the supremum (Theorem 1.33), we also have

$$I(f) + I(g) = \sup\left\{\int_a^b s_1 + \int_a^b s_2 \;\Big|\; s_1 \le f,\ s_2 \le g\right\}. \tag{1.11}$$

But if $s_1 \le f$ and $s_2 \le g$, then the sum $s = s_1 + s_2$ is a step function below $f + g$, and we have

$$\int_a^b s_1 + \int_a^b s_2 = \int_a^b s \le \underline{I}(f + g).$$

Therefore, the number $\underline{I}(f + g)$ is an upper bound for the set appearing on the right of (1.11). This upper bound cannot be less than the least upper bound of the set, so we have

$$I(f) + I(g) \le \underline{I}(f + g). \tag{1.12}$$

Similarly, if we use the relations

$$I(f) = \inf\left\{\int_a^b t_1 \;\Big|\; f \le t_1\right\}, \qquad I(g) = \inf\left\{\int_a^b t_2 \;\Big|\; g \le t_2\right\},$$

where $t_1$ and $t_2$ denote arbitrary step functions above $f$ and $g$, respectively, we obtain the inequality

$$\bar{I}(f + g) \le I(f) + I(g). \tag{1.13}$$

Inequalities (1.12) and (1.13) together show that $\underline{I}(f + g) = \bar{I}(f + g) = I(f) + I(g)$. Therefore $f + g$ is integrable and relation (A) holds.

Relation (B) is trivial if $c = 0$. If $c > 0$, we note that every step function $s_1$ below $cf$ is of the form $s_1 = cs$, where $s$ is a step function below $f$. Similarly, every step function $t_1$ above $cf$ is of the form $t_1 = ct$, where $t$ is a step function above $f$. Therefore we have

$$\underline{I}(cf) = \sup\left\{\int_a^b s_1 \;\Big|\; s_1 \le cf\right\} = \sup\left\{c\int_a^b s \;\Big|\; s \le f\right\} = cI(f)$$

and

$$\bar{I}(cf) = \inf\left\{\int_a^b t_1 \;\Big|\; cf \le t_1\right\} = \inf\left\{c\int_a^b t \;\Big|\; f \le t\right\} = cI(f).$$

Therefore $\underline{I}(cf) = \bar{I}(cf) = cI(f)$. Here we have used the following properties of the supremum and infimum:

$$\sup\{cx \mid x \in A\} = c \sup\{x \mid x \in A\}, \qquad \inf\{cx \mid x \in A\} = c \inf\{x \mid x \in A\}, \tag{1.14}$$

which hold if $c > 0$. This proves (B) if $c > 0$.

If $c < 0$, the proof of (B) is basically the same, except that every step function $s_1$ below $cf$ is of the form $s_1 = ct$, where $t$ is a step function above $f$, and every step function $t_1$ above $cf$ is of the form $t_1 = cs$, where $s$ is a step function below $f$. Also, instead of (1.14) we use the relations

$$\sup\{cx \mid x \in A\} = c \inf\{x \mid x \in A\}, \qquad \inf\{cx \mid x \in A\} = c \sup\{x \mid x \in A\},$$

which hold if $c < 0$. We now have

$$\underline{I}(cf) = \sup\left\{\int_a^b s_1 \;\Big|\; s_1 \le cf\right\} = \sup\left\{c\int_a^b t \;\Big|\; f \le t\right\} = c \inf\left\{\int_a^b t \;\Big|\; f \le t\right\} = cI(f).$$

Similarly, we find $\bar{I}(cf) = cI(f)$. Therefore (B) holds for all real $c$.

Proof of Additivity with Respect to the Interval of Integration (Theorem 1.17). Suppose that $a < b < c$, and assume that the two integrals $\int_a^b f$ and $\int_b^c f$ exist. Let $\underline{I}(f)$ and $\bar{I}(f)$ denote the lower and upper integrals of $f$ over the interval $[a, c]$. We shall prove that

$$\underline{I}(f) = \bar{I}(f) = \int_a^b f + \int_b^c f. \tag{1.15}$$

If $s$ is any step function below $f$ on $[a, c]$, we have

$$\int_a^c s = \int_a^b s + \int_b^c s.$$

Conversely, if $s_1$ and $s_2$ are step functions below $f$ on $[a, b]$ and on $[b, c]$, respectively, then the function $s$ which is equal to $s_1$ on $[a, b)$ and equal to $s_2$ on $[b, c]$ is a step function below $f$ on $[a, c]$ for which we have

$$\int_a^c s = \int_a^b s_1 + \int_b^c s_2.$$

Therefore, by the additive property of the supremum (Theorem 1.33), we have

$$\underline{I}(f) = \sup\left\{\int_a^c s \;\Big|\; s \le f\right\} = \sup\left\{\int_a^b s_1 \;\Big|\; s_1 \le f\right\} + \sup\left\{\int_b^c s_2 \;\Big|\; s_2 \le f\right\} = \int_a^b f + \int_b^c f.$$

Similarly, we find

$$\bar{I}(f) = \int_a^b f + \int_b^c f,$$

which proves (1.15) when $a < b < c$. The proof is similar for any other arrangement of the points $a$, $b$, $c$.


Proof of the Translation Property (Theorem 1.18). Let $g$ be the function defined on the interval $[a + c, b + c]$ by the equation $g(x) = f(x - c)$. Let $\underline{I}(g)$ and $\bar{I}(g)$ denote the lower and upper integrals of $g$ on the interval $[a + c, b + c]$. We shall prove that

$$\underline{I}(g) = \bar{I}(g) = \int_a^b f(x)\,dx. \tag{1.16}$$

Let $s$ be any step function below $g$ on the interval $[a + c, b + c]$. Then the function $s_1$ defined on $[a, b]$ by the equation $s_1(x) = s(x + c)$ is a step function below $f$ on $[a, b]$. Moreover, every step function below $f$ on $[a, b]$ has this form for some $s$ below $g$. Also, by the translation property for integrals of step functions, we have

$$\int_{a+c}^{b+c} s(x)\,dx = \int_a^b s(x + c)\,dx = \int_a^b s_1(x)\,dx.$$

Therefore we have

$$\underline{I}(g) = \sup\left\{\int_{a+c}^{b+c} s \;\Big|\; s \le g\right\} = \sup\left\{\int_a^b s_1 \;\Big|\; s_1 \le f\right\} = \int_a^b f(x)\,dx.$$

Similarly, we find $\bar{I}(g) = \int_a^b f(x)\,dx$, which proves (1.16).

Proof of the Expansion Property (Theorem 1.19). Assume $k > 0$ and define $g$ on the interval $[ka, kb]$ by the equation $g(x) = f(x/k)$. Let $\underline{I}(g)$ and $\bar{I}(g)$ denote the lower and upper integrals of $g$ on $[ka, kb]$. We shall prove that

$$\underline{I}(g) = \bar{I}(g) = k \int_a^b f(x)\,dx. \tag{1.17}$$

Let $s$ be any step function below $g$ on $[ka, kb]$. Then the function $s_1$ defined on $[a, b]$ by the equation $s_1(x) = s(kx)$ is a step function below $f$ on $[a, b]$. Moreover, every step function $s_1$ below $f$ on $[a, b]$ has this form. Also, by the expansion property for integrals of step functions, we have

$$\int_{ka}^{kb} s(x)\,dx = k \int_a^b s(kx)\,dx = k \int_a^b s_1(x)\,dx.$$

Therefore we have

$$\underline{I}(g) = \sup\left\{\int_{ka}^{kb} s \;\Big|\; s \le g\right\} = \sup\left\{k\int_a^b s_1 \;\Big|\; s_1 \le f\right\} = k \int_a^b f(x)\,dx.$$

Similarly, we find $\bar{I}(g) = k \int_a^b f(x)\,dx$, which proves (1.17) if $k > 0$. The same type of proof can be used if $k < 0$.

Proof of the Comparison Theorem (Theorem 1.20). Assume $g \le f$ on the interval $[a, b]$. Let $s$ be any step function below $g$, and let $t$ be any step function above $f$. Then we have $\int_a^b s \le \int_a^b t$, and hence Theorem 1.34 gives us

$$\int_a^b g = \sup\left\{\int_a^b s \;\Big|\; s \le g\right\} \le \inf\left\{\int_a^b t \;\Big|\; f \le t\right\} = \int_a^b f.$$

This proves that $\int_a^b g \le \int_a^b f$, as required.




2 

SOME APPLICATIONS OF INTEGRATION 


2.1 Introduction 

In Section 1.18 we expressed the area of the ordinate set of a nonnegative function as an 
integral. In this chapter we will show that areas of more general regions can also be 
expressed as integrals. We will also discuss further applications of the integral to concepts 
such as volume, work, and averages. Then, at the end of the chapter, we will study 
properties of functions defined by integrals. 

2.2 The area of a region between two graphs expressed as an integral

If two functions $f$ and $g$ are related by the inequality $f(x) \le g(x)$ for all $x$ in an interval $[a, b]$, we write $f \le g$ on $[a, b]$. Figure 2.1 shows two examples. If $f \le g$ on $[a, b]$, the set $S$ consisting of all points $(x, y)$ satisfying the inequalities

$$f(x) \le y \le g(x), \qquad a \le x \le b,$$

is called the region between the graphs of $f$ and $g$. The following theorem tells us how to express the area of $S$ as an integral.




Figure 2.1 The area of a region between two graphs expressed as an integral: $a(S) = \int_a^b [g(x) - f(x)]\,dx$.

THEOREM 2.1. Assume $f$ and $g$ are integrable and satisfy $f \le g$ on $[a, b]$. Then the region $S$ between their graphs is measurable and its area $a(S)$ is given by the integral

$$a(S) = \int_a^b [g(x) - f(x)]\,dx. \tag{2.1}$$

Proof. Assume first that $f$ and $g$ are nonnegative, as shown in Figure 2.1(a). Let $F$ and $G$ denote the following sets:

$$F = \{(x, y) \mid a \le x \le b,\ 0 \le y < f(x)\}, \qquad G = \{(x, y) \mid a \le x \le b,\ 0 \le y \le g(x)\}.$$

That is, $G$ is the ordinate set of $g$, and $F$ is the ordinate set of $f$, minus the graph of $f$. The region $S$ between the graphs of $f$ and $g$ is the difference $S = G - F$. By Theorems 1.10 and 1.11, both $F$ and $G$ are measurable. Since $F \subseteq G$, the difference $S = G - F$ is also measurable, and we have

$$a(S) = a(G) - a(F) = \int_a^b g(x)\,dx - \int_a^b f(x)\,dx = \int_a^b [g(x) - f(x)]\,dx.$$

This proves (2.1) when $f$ and $g$ are nonnegative.

Now consider the general case where $f \le g$ on $[a, b]$, but $f$ and $g$ are not necessarily nonnegative. An example is shown in Figure 2.1(b). We can reduce this to the previous case by sliding the region upward until it lies above the x-axis. That is, we choose a positive number $c$ large enough to ensure that $0 \le f(x) + c \le g(x) + c$ for all $x$ in $[a, b]$. By what we have already proved, the new region $T$ between the graphs of $f + c$ and $g + c$ is measurable, and its area is given by the integral

$$a(T) = \int_a^b [(g(x) + c) - (f(x) + c)]\,dx = \int_a^b [g(x) - f(x)]\,dx.$$

But $T$ is congruent to $S$; so $S$ is also measurable and we have

$$a(S) = a(T) = \int_a^b [g(x) - f(x)]\,dx.$$


This completes the proof. 

2.3 Worked examples 

EXAMPLE 1. Compute the area of the region $S$ between the graphs of $f$ and $g$ over the interval $[0, 2]$ if $f(x) = x(x - 2)$ and $g(x) = x/2$.

Solution. The two graphs are shown in Figure 2.2. The shaded portion represents $S$. Since $f \le g$ over the interval $[0, 2]$, we use Theorem 2.1 to write

$$a(S) = \int_0^2 [g(x) - f(x)]\,dx = \int_0^2 \left(\frac{5}{2}x - x^2\right) dx = \frac{5}{2}\cdot\frac{2^2}{2} - \frac{2^3}{3} = \frac{7}{3}.$$





Figure 2.2 Example 1 . Figure 2.3 Example 2. 


EXAMPLE 2. Compute the area of the region $S$ between the graphs of $f$ and $g$ over the interval $[-1, 2]$ if $f(x) = x$ and $g(x) = x^3/4$.

Solution. The region $S$ is shown in Figure 2.3. Here we do not have $f \le g$ throughout the interval $[-1, 2]$. However, we do have $f \le g$ over the subinterval $[-1, 0]$ and $g \le f$ over the subinterval $[0, 2]$. Applying Theorem 2.1 to each subinterval, we have

$$a(S) = \int_{-1}^0 [g(x) - f(x)]\,dx + \int_0^2 [f(x) - g(x)]\,dx$$

$$= -\frac{1}{4}\cdot\frac{(-1)^4}{4} + \frac{(-1)^2}{2} + \frac{2^2}{2} - \frac{1}{4}\cdot\frac{2^4}{4} = \frac{23}{16}.$$

In examples like this one, where the interval $[a, b]$ can be broken up into a finite number of subintervals such that either $f \le g$ or $g \le f$ in each subinterval, formula (2.1) of Theorem 2.1 becomes

$$a(S) = \int_a^b |g(x) - f(x)|\,dx.$$
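The absolute-value form of the area formula is convenient for a quick numerical cross-check of Examples 1 and 2. The sketch below is an illustration only: `area_between` is a hypothetical helper, and the midpoint sum it uses is an approximation rather than the exact method of the text.

```python
def area_between(f, g, a, b, n=20000):
    """Midpoint approximation to the integral of |g(x) - f(x)| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += abs(g(x) - f(x))
    return h * total

# Example 1: f(x) = x(x - 2), g(x) = x/2 on [0, 2]; the text finds 7/3.
area1 = area_between(lambda x: x * (x - 2), lambda x: x / 2, 0.0, 2.0)
# Example 2: f(x) = x, g(x) = x**3/4 on [-1, 2]; the text finds 23/16.
area2 = area_between(lambda x: x, lambda x: x ** 3 / 4, -1.0, 2.0)
```

Because the absolute value handles the sign change automatically, the same call covers both the case $f \le g$ throughout and the case where the graphs cross.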



EXAMPLE 3. Area of a circular disk. A circular disk of radius $r$ is the set of all points inside or on the boundary of a circle of radius $r$. Such a disk is congruent to the region between the graphs of the two functions $f$ and $g$ defined on the interval $[-r, r]$ by the formulas

$$g(x) = \sqrt{r^2 - x^2} \quad \text{and} \quad f(x) = -\sqrt{r^2 - x^2}.$$

Each function is bounded and piecewise monotonic so each is integrable on $[-r, r]$. Theorem 2.1 tells us that the region between their graphs is measurable and that its area is $\int_{-r}^{r} [g(x) - f(x)]\,dx$. Let $A(r)$ denote the area of the disk. We will prove that

$$A(r) = r^2 A(1).$$

That is, the area of a disk of radius $r$ is $r^2$ times the area of a unit disk (a disk of radius 1). Since $g(x) - f(x) = 2g(x)$, Theorem 2.1 gives us

$$A(r) = \int_{-r}^{r} 2g(x)\,dx = 2\int_{-r}^{r} \sqrt{r^2 - x^2}\,dx.$$

In particular, when $r = 1$, we have the formula

$$A(1) = 2\int_{-1}^{1} \sqrt{1 - x^2}\,dx.$$

Now we change the scale on the x-axis, using Theorem 1.19 with $k = 1/r$, to obtain

$$A(r) = 2\int_{-r}^{r} g(x)\,dx = 2r\int_{-1}^{1} g(rx)\,dx = 2r\int_{-1}^{1} \sqrt{r^2 - (rx)^2}\,dx = 2r^2\int_{-1}^{1} \sqrt{1 - x^2}\,dx = r^2 A(1).$$

This proves that $A(r) = r^2 A(1)$, as asserted.

DEFINITION. We define the number $\pi$ to be the area of a unit disk.

The formula just proved states that $A(r) = \pi r^2$.
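The scaling relation $A(r) = r^2 A(1)$ can be observed numerically. This is a rough sketch under stated assumptions: `disk_area` is a hypothetical helper that approximates $2\int_{-r}^{r}\sqrt{r^2 - x^2}\,dx$ by a midpoint sum, and the comparison with `math.pi` relies on the library constant agreeing with the number $\pi$ defined above as the area of a unit disk.

```python
import math

def disk_area(r, n=100000):
    """Midpoint approximation to 2 * integral of sqrt(r**2 - x**2) over [-r, r]."""
    h = 2 * r / n
    total = 0.0
    for i in range(n):
        x = -r + (i + 0.5) * h
        # max(..., 0.0) guards against a tiny negative value from rounding.
        total += math.sqrt(max(r * r - x * x, 0.0))
    return 2 * h * total

unit_area = disk_area(1.0)
ratio = disk_area(3.0) / unit_area   # should be close to 3**2 = 9
```

The ratio matches $r^2$ far more closely than `unit_area` matches $\pi$, because the sample points for radius $r$ are exactly $r$ times those for radius 1, which mirrors the change of scale used in the proof.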

The foregoing example illustrates the behavior of area under expansion or contraction 
of plane regions. Suppose S is a given set of points in the plane and consider a new set of 
points obtained by multiplying the coordinates of each point of S by a constant factor 
k > 0. We denote this set by kS and say that it is similar to S. The process which produces 
kS from S is called a similarity transformation. Each point is moved along a straight line 
which passes through the origin to k times its original distance from the origin. If k > 1, 
the transformation is also called a stretching or an expansion (from the origin) and, if 
0 < k < 1, it is called a shrinking or a contraction (toward the origin). 

For example, if $S$ is the region bounded by a unit circle with center at the origin, then $kS$ is a concentric circular region of radius $k$. In Example 3 we showed that for circular regions, the area of $kS$ is $k^2$ times the area of $S$. Now we prove that this property of area holds for any ordinate set.

EXAMPLE 4. Behavior of the area of an ordinate set under a similarity transformation. Let $f$ be nonnegative and integrable on $[a, b]$ and let $S$ be its ordinate set. An example is shown in Figure 2.4(a). If we apply a similarity transformation with a positive factor $k$, then $kS$ is the ordinate set of a new function, say $g$, over the interval $[ka, kb]$. [See Figure 2.4(b).] A point $(x, y)$ is on the graph of $g$ if and only if the point $(x/k, y/k)$ is on the graph of $f$. Hence $y/k = f(x/k)$, so $y = kf(x/k)$. In other words, the new function $g$ is related to $f$ by the formula

$$g(x) = kf(x/k)$$

for each $x$ in $[ka, kb]$. Therefore, the area of $kS$ is given by

$$a(kS) = \int_{ka}^{kb} g(x)\,dx = k\int_{ka}^{kb} f(x/k)\,dx = k^2 \int_a^b f(x)\,dx,$$

where in the last step we used the expansion property for integrals (Theorem 1.19). Since $\int_a^b f(x)\,dx = a(S)$, this proves that $a(kS) = k^2 a(S)$. In other words, the area of $kS$ is $k^2$ times that of $S$.

Figure 2.4 The area of $kS$ is $k^2$ times that of $S$.


EXAMPLE 5. Calculation of the integral $\int_0^a x^{1/2}\,dx$. The integral for area is a two-edged sword. Although we ordinarily use the integral to calculate areas, sometimes we can use our knowledge of area to calculate integrals. We illustrate by computing the value of the integral $\int_0^a x^{1/2}\,dx$, where $a > 0$. (The integral exists since the integrand is increasing and bounded on $[0, a]$.)

Figure 2.5 shows the graph of the function $f$ given by $f(x) = x^{1/2}$ over the interval $[0, a]$. Its ordinate set $S$ has an area given by

$$a(S) = \int_0^a x^{1/2}\,dx.$$

Now we compute this area another way. We simply observe that in Figure 2.5 the region $S$ and the shaded region $T$ together fill out a rectangle of base $a$ and altitude $a^{1/2}$. Therefore, $a(S) + a(T) = a^{3/2}$, so we have

$$a(S) = a^{3/2} - a(T).$$

But $T$ is the ordinate set of a function $g$ defined over the interval $[0, a^{1/2}]$ on the y-axis by the equation $g(y) = y^2$. Thus, we have

$$a(T) = \int_0^{a^{1/2}} g(y)\,dy = \int_0^{a^{1/2}} y^2\,dy = \tfrac{1}{3}a^{3/2},$$

so $a(S) = a^{3/2} - \tfrac{1}{3}a^{3/2} = \tfrac{2}{3}a^{3/2}$. This proves that

$$\int_0^a x^{1/2}\,dx = \tfrac{2}{3}a^{3/2}.$$





More generally, if $a > 0$ and $b > 0$, we may use the additive property of the integral to obtain the formula

$$\int_a^b x^{1/2}\,dx = \tfrac{2}{3}\left(b^{3/2} - a^{3/2}\right).$$

The foregoing argument can also be used to compute the integral $\int_a^b x^{1/n}\,dx$, if $n$ is a positive integer. We state the result as a theorem.

THEOREM 2.2. For $a > 0$, $b > 0$ and $n$ a positive integer, we have

$$\int_a^b x^{1/n}\,dx = \frac{b^{1+1/n} - a^{1+1/n}}{1 + 1/n}. \tag{2.2}$$


The proof is so similar to that in Example 5 that we leave the details to the reader. 
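Formula (2.2) can be compared with a direct numerical approximation before working through the proof. The sketch below is illustrative only; the helper names `root_integral_formula` and `midpoint_integral`, and the sample values $a = 1$, $b = 8$, $n = 3$, are assumptions chosen so that the right-hand side is easy to evaluate by hand.

```python
def root_integral_formula(a, b, n):
    """Right-hand side of (2.2): (b**(1+1/n) - a**(1+1/n)) / (1 + 1/n)."""
    e = 1 + 1 / n
    return (b ** e - a ** e) / e

def midpoint_integral(f, a, b, steps=50000):
    """Approximate the integral of f over [a, b] by a midpoint sum."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

a, b, n = 1.0, 8.0, 3
numeric = midpoint_integral(lambda x: x ** (1 / n), a, b)
exact = root_integral_formula(a, b, n)   # (8**(4/3) - 1) / (4/3) = 15 * 3/4
```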
2.4 Exercises

In Exercises 1 through 14, compute the area of the region S between the graphs of f and g over the interval [a, b] specified in each case. Make a sketch of the two graphs and indicate S by shading.


1. $f(x) = 4 - x^2$, $g(x) = 0$, $a = -2$, $b = 2$.
2. $f(x) = 4 - x^2$, $g(x) = 8 - 2x^2$, $a = -2$, $b = 2$.
3. $f(x) = x^3 + x^2$, $g(x) = x^3 + 1$, $a = -1$, $b = 1$.
4. $f(x) = x - x^2$, $g(x) = -x$, $a = 0$, $b = 2$.
5. $f(x) = x^{1/3}$, $g(x) = x^{1/2}$, $a = 0$, $b = 1$.
6. $f(x) = x^{1/3}$, $g(x) = x^{1/2}$, $a = 1$, $b = 2$.
7. $f(x) = x^{1/3}$, $g(x) = x^{1/2}$, $a = 0$, $b = 2$.
8. $f(x) = x^{1/2}$, $g(x) = x^2$, $a = 0$, $b = 2$.
9. $f(x) = x^2$, $g(x) = x + 1$, $a = -1$, $b = (1 + \sqrt{5})/2$.
10. $f(x) = x(x^2 - 1)$, $g(x) = x$, $a = -1$, $b = \sqrt{2}$.
11. $f(x) = |x|$, $g(x) = x^2 - 1$, $a = -1$, $b = 1$.
12. $f(x) = |x - 1|$, $g(x) = x^2 - 2x$, $a = 0$, $b = 2$.

13. $f(x) = 2|x|$, $g(x) = 1 - 3x^3$, $a = -\sqrt{3}/3$, $b = \frac{1}{3}$.
14. $f(x) = |x| + |x - 1|$, $g(x) = 0$, $a = -1$, $b = 2$.


15. The graphs of $f(x) = x^2$ and $g(x) = cx^3$, where $c > 0$, intersect at the points $(0, 0)$ and $(1/c, 1/c^2)$. Find $c$ so that the region which lies between these graphs and over the interval $[0, 1/c]$ has area $\frac{2}{3}$.

16. Let f(x) = x — x 2 ,g(x) - ax. Determine a so that the region above the graph ofg and below 
the graph off has area f . 

17. We have defined $\pi$ to be the area of a unit circular disk. In Example 3 of Section 2.3, we proved that $\pi = 2\int_{-1}^{1} \sqrt{1 - x^2}\,dx$. Use properties of the integral to compute the following in terms of $\pi$:

( a ) j> 9 - x 3 dx: (b) J 0 Vl - \x 2 dx: (c) J 2 (x - 3)V 4 — x 2 dx. 

18. Calculate the areas of regular dodecagons (twelve-sided polygons) inscribed and circumscribed about a unit circular disk and thereby deduce the inequalities $3 < \pi < 12(2 - \sqrt{3})$.

19. Let $C$ denote the unit circle, whose Cartesian equation is $x^2 + y^2 = 1$. Let $E$ be the set of points obtained by multiplying the x-coordinate of each point $(x, y)$ on $C$ by a constant factor $a > 0$ and the y-coordinate by a constant factor $b > 0$. The set $E$ is called an ellipse. (When $a = b$, the ellipse is another circle.)

(a) Show that each point $(x, y)$ on $E$ satisfies the Cartesian equation $(x/a)^2 + (y/b)^2 = 1$.

(b) Use properties of the integral to prove that the region enclosed by this ellipse is measurable and that its area is $\pi ab$.

20. Exercise 19 is a generalization of Example 3 of Section 2.3. State and prove a corresponding 
generalization of Example 4 of Section 2.3. 

21. Use an argument similar to that in Example 5 of Section 2.3 to prove Theorem 2.2. 

2.5 The trigonometric functions 

Before we introduce further applications of integration, we will digress briefly to discuss 
the trigonometric functions. We assume that the reader has some knowledge of the 
properties of the six trigonometric functions, sine, cosine, tangent, cotangent, secant, and 
cosecant; and their inverses, arc sine, arc cosine, arc tangent, etc. These functions are 
discussed in elementary trigonometry courses in connection with various problems involving 
the sides and angles of triangles. 






The trigonometric functions are important in calculus, not so much because of their 
relation to the sides and angles of a triangle, but rather because of the properties they 
possess as functions. The six trigonometric functions have in common an important 
property known as periodicity. 

A function $f$ is said to be periodic with period $p \neq 0$ if its domain contains $x + p$ whenever it contains $x$ and if $f(x + p) = f(x)$ for every $x$ in the domain of $f$. The sine and cosine functions are periodic with period $2\pi$, where $\pi$ is the area of a unit circular disk. Many
problems in physics and engineering deal with periodic phenomena (such as vibrations, 
planetary and wave motion) and the sine and cosine functions form the basis for the 
mathematical analysis of such problems. 

The sine and cosine functions can be introduced in many different ways. For example, 
there are geometric definitions which relate the sine and cosine functions to angles, and 
there are analytic definitions which introduce these functions without any reference whatever 
to geometry. All these methods are equivalent, in the sense that they all lead to the same 
functions. 

Ordinarily, when we work with the sine and cosine we are not concerned so much with 
their definitions as we are with the properties that can be deduced from the definitions. 
Some of these properties, which are of importance in calculus, are listed below. As usual, 
we denote the values of the sine and cosine functions at x by sin x, cos x, respectively. 

FUNDAMENTAL PROPERTIES OF THE SINE AND COSINE.

1. Domain of definition. The sine and cosine functions are defined everywhere on the real line.

2. Special values. We have $\cos 0 = \sin \frac{1}{2}\pi = 1$, $\cos \pi = -1$.

3. Cosine of a difference. For all $x$ and $y$, we have

$$\cos(y - x) = \cos y \cos x + \sin y \sin x. \tag{2.3}$$

4. Fundamental inequalities. For $0 < x < \frac{1}{2}\pi$, we have

$$0 < \cos x < \frac{\sin x}{x} < \frac{1}{\cos x}. \tag{2.4}$$

From these four properties we can deduce all the properties of the sine and cosine that 
are of importance in calculus. This suggests that we might introduce the trigonometric 
functions axiomatically. That is, we could take properties 1 through 4 as axioms about the 
sine and cosine and deduce all further properties as theorems. To make certain we are not 
discussing an empty theory, it is necessary to show that there are functions satisfying the 
above properties. We shall by-pass this problem for the moment. First we assume that 
functions exist which satisfy these fundamental properties and show how further properties 
can then be deduced. Then, in Section 2.7, we indicate a geometric method of defining the 
sine and cosine so as to obtain functions with the desired properties. In Chapter 11 we also 
outline an analytic method for defining the sine and cosine.

THEOREM 2.3. If two functions sin and cos satisfy properties 1 through 4, then they also satisfy the following properties:

(a) Pythagorean identity. $\sin^2 x + \cos^2 x = 1$ for all $x$.

(b) Special values. $\sin 0 = \cos \frac{1}{2}\pi = \sin \pi = 0$.

(c) Even and odd properties. The cosine is an even function and the sine is an odd function. That is, for all $x$ we have

$$\cos(-x) = \cos x, \qquad \sin(-x) = -\sin x.$$

(d) Co-relations. For all $x$, we have

$$\sin(\tfrac{1}{2}\pi + x) = \cos x, \qquad \cos(\tfrac{1}{2}\pi + x) = -\sin x.$$

(e) Periodicity. For all $x$, we have $\sin(x + 2\pi) = \sin x$, $\cos(x + 2\pi) = \cos x$.

(f) Addition formulas. For all $x$ and $y$, we have

$$\cos(x + y) = \cos x \cos y - \sin x \sin y,$$
$$\sin(x + y) = \sin x \cos y + \cos x \sin y.$$

(g) Difference formulas. For all $a$ and $b$, we have

$$\sin a - \sin b = 2 \sin\frac{a - b}{2} \cos\frac{a + b}{2},$$
$$\cos a - \cos b = -2 \sin\frac{a - b}{2} \sin\frac{a + b}{2}.$$

(h) Monotonicity. In the interval $[0, \frac{1}{2}\pi]$, the sine is strictly increasing and the cosine is strictly decreasing.


Proof. Part (a) follows at once if we take $x = y$ in (2.3) and use the relation $\cos 0 = 1$. Property (b) follows from (a) by taking $x = 0$, $x = \frac{1}{2}\pi$, $x = \pi$ and using the relation $\sin \frac{1}{2}\pi = 1$. The even property of the cosine also follows from (2.3) by taking $y = 0$. Next we deduce the formula

$$\cos(\tfrac{1}{2}\pi - x) = \sin x, \tag{2.5}$$

by taking $y = \frac{1}{2}\pi$ in (2.3). From this and (2.3), we find that the sine is odd, since

$$\sin(-x) = \cos\left(\frac{\pi}{2} + x\right) = \cos\left[\pi - \left(\frac{\pi}{2} - x\right)\right] = \cos \pi \cos\left(\frac{\pi}{2} - x\right) + \sin \pi \sin\left(\frac{\pi}{2} - x\right) = -\sin x.$$

This proves (c). To prove (d), we again use (2.5), first with $x$ replaced by $\frac{1}{2}\pi + x$ and then with $x$ replaced by $-x$. Repeated use of (d) then gives us the periodicity relations (e).

To prove the addition formula for the cosine, we simply replace $x$ by $-x$ in (2.3) and use the even and odd properties. Then we use part (d) and the addition formula for the cosine to obtain

$$\sin(x + y) = -\cos\left(x + y + \frac{\pi}{2}\right) = -\cos x \cos\left(y + \frac{\pi}{2}\right) + \sin x \sin\left(y + \frac{\pi}{2}\right) = \cos x \sin y + \sin x \cos y.$$

This proves (f). To deduce the difference formulas (g), we first replace $y$ by $-y$ in the addition formula for $\sin(x + y)$ to obtain

$$\sin(x - y) = \sin x \cos y - \cos x \sin y.$$

Subtracting this from the formula for $\sin(x + y)$ and doing the same for the cosine function, we get

$$\sin(x + y) - \sin(x - y) = 2 \sin y \cos x,$$
$$\cos(x + y) - \cos(x - y) = -2 \sin y \sin x.$$

Taking $x = (a + b)/2$, $y = (a - b)/2$, we find that these become the difference formulas in (g).

Properties (a) through (g) were deduced from properties 1 through 3 alone. Property 4 is used to prove (h). The inequalities (2.4) show that $\cos x$ and $\sin x$ are positive if $0 < x < \frac{1}{2}\pi$. Now, if $0 \le b < a \le \frac{1}{2}\pi$, the numbers $(a + b)/2$ and $(a - b)/2$ are in the interval $(0, \frac{1}{2}\pi)$, and the difference formulas (g) show that $\sin a > \sin b$ and $\cos a < \cos b$. This completes the proof of Theorem 2.3.

Further properties of the sine and cosine functions are discussed in the next set of exercises (page 104). We mention, in particular, two formulas that are used frequently in calculus. These are called the double-angle or duplication formulas. We have

$$\sin 2x = 2 \sin x \cos x, \qquad \cos 2x = \cos^2 x - \sin^2 x = 1 - 2\sin^2 x.$$

These are, of course, merely special cases of the addition formulas obtained by taking $y = x$. The second formula for $\cos 2x$ follows from the first by use of the Pythagorean identity. The Pythagorean identity also shows that $|\cos x| \le 1$ and $|\sin x| \le 1$ for all $x$.
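The difference formulas in part (g) of Theorem 2.3 are easy to mistype, so a numerical spot check can be reassuring. The sketch below assumes that Python's `math.sin` and `math.cos` are floating-point models of functions satisfying properties 1 through 4; the helper name `difference_formula_gap` and the sample points are illustrative choices.

```python
import math

def difference_formula_gap(a, b):
    """Largest absolute discrepancy in the two difference formulas (g)."""
    sin_gap = (math.sin(a) - math.sin(b)
               - 2 * math.sin((a - b) / 2) * math.cos((a + b) / 2))
    cos_gap = (math.cos(a) - math.cos(b)
               + 2 * math.sin((a - b) / 2) * math.sin((a + b) / 2))
    return max(abs(sin_gap), abs(cos_gap))

samples = [(-2.0, 0.3), (0.0, 1.0), (1.7, 1.2), (4.0, -5.5)]
worst_gap = max(difference_formula_gap(a, b) for a, b in samples)
```

A check of this kind cannot replace the proof, but a discrepancy well above rounding error would signal a sign or argument error in a transcribed formula.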


2.6 Integration formulas for the sine and cosine 

The monotonicity properties in part (h) of Theorem 2.3, along with the co-relations and 
the periodicity properties, show that the sine and cosine functions are piecewise monotonic 
on every interval. Therefore, by repeated use of Theorem 1.12, we see that the sine and 
cosine are integrable on every finite interval. Now we shall calculate their integrals by 
applying Theorem 1.14. This calculation makes use of a pair of inequalities which we state 
as a separate theorem. 


THEOREM 2.4. If $0 < a \le \frac{1}{2}\pi$ and $n \ge 1$, we have

$$\frac{a}{n} \sum_{k=1}^{n} \cos\frac{ka}{n} < \sin a < \frac{a}{n} \sum_{k=0}^{n-1} \cos\frac{ka}{n}. \tag{2.6}$$

Proof. The inequalities in (2.6) will be deduced from the trigonometric identity

$$2 \sin \tfrac{1}{2}x \sum_{k=1}^{n} \cos kx = \sin(n + \tfrac{1}{2})x - \sin \tfrac{1}{2}x, \tag{2.7}$$

which is valid for $n \ge 1$ and all real $x$. To prove (2.7), we use one of the difference formulas (g) of Theorem 2.3 to write

$$2 \sin \tfrac{1}{2}x \cos kx = \sin(k + \tfrac{1}{2})x - \sin(k - \tfrac{1}{2})x.$$

Taking $k = 1, 2, \ldots, n$ and adding these equations, we find that the sum on the right telescopes and we obtain (2.7).

If $\frac{1}{2}x$ is not an integer multiple of $\pi$ we can divide both members of (2.7) by $2 \sin \frac{1}{2}x$ to obtain

$$\sum_{k=1}^{n} \cos kx = \frac{\sin(n + \frac{1}{2})x - \sin \frac{1}{2}x}{2 \sin \frac{1}{2}x}.$$

Replacing $n$ by $n - 1$ and adding 1 to both members we also obtain

$$\sum_{k=0}^{n-1} \cos kx = \frac{\sin(n - \frac{1}{2})x + \sin \frac{1}{2}x}{2 \sin \frac{1}{2}x}.$$

Both these formulas are valid if $x \neq 2m\pi$, where $m$ is an integer. Taking $x = a/n$, where $0 < a \le \frac{1}{2}\pi$, we find that the pair of inequalities in (2.6) is equivalent to the pair

$$\frac{a}{n} \cdot \frac{\sin(n + \frac{1}{2})\frac{a}{n} - \sin\left(\frac{a}{2n}\right)}{2 \sin\left(\frac{a}{2n}\right)} < \sin a < \frac{a}{n} \cdot \frac{\sin(n - \frac{1}{2})\frac{a}{n} + \sin\left(\frac{a}{2n}\right)}{2 \sin\left(\frac{a}{2n}\right)}.$$

This pair, in turn, is equivalent to the pair

$$\sin(n + \tfrac{1}{2})\frac{a}{n} - \sin\left(\frac{a}{2n}\right) < \frac{\sin\left(\frac{a}{2n}\right)}{\frac{a}{2n}} \sin a < \sin(n - \tfrac{1}{2})\frac{a}{n} + \sin\left(\frac{a}{2n}\right). \tag{2.8}$$
Therefore, proving (2.6) is equivalent to proving (2.8). We shall prove that we have 
(2.9) 


sin (2n + 1)0 — sin 0 < S ' n ^ sin 2nd < sin (2n — 1)0 + sin 0 


0 


for 0 <[ 2nd < \rr . When 0 = a/(2«) this reduces to (2.8). 
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To prove the leftmost inequality in (2.9), we use the addition formula for the sine to 
write 

(2.10) sin (2n + 1)% = sin 2nd COS 0 + COS 2nd sin % < sin 2nd + sin 0 , 

where we have also used the inequalities 

COS 0 <— , 0 < cos 2nd < 1 , sin 0 > 0 , 

6 

all of which are valid since 0 < 2nd < \ir. Inequality (2.10) is equivalent to the leftmost 
inequality in (2.9). 

To prove the rightmost inequality in (2.9), we again use the addition formula for the sine
and write

$$\sin (2n-1)\theta = \sin 2n\theta \cos\theta - \cos 2n\theta \sin\theta .$$

Adding $\sin\theta$ to both members, we obtain

$$\sin (2n-1)\theta + \sin\theta = \sin 2n\theta \left( \cos\theta + \sin\theta\,\frac{1 - \cos 2n\theta}{\sin 2n\theta} \right) . \tag{2.11}$$

But since we have

$$\frac{1 - \cos 2n\theta}{\sin 2n\theta} = \frac{2\sin^2 n\theta}{2\sin n\theta \cos n\theta} = \frac{\sin n\theta}{\cos n\theta} ,$$

the right member of (2.11) is equal to

$$\sin 2n\theta \left( \cos\theta + \sin\theta\,\frac{\sin n\theta}{\cos n\theta} \right) = \sin 2n\theta\,\frac{\cos\theta \cos n\theta + \sin\theta \sin n\theta}{\cos n\theta} = \sin 2n\theta\,\frac{\cos (n-1)\theta}{\cos n\theta} .$$

Therefore, to complete the proof of (2.9), we need only show that

$$\frac{\cos (n-1)\theta}{\cos n\theta} > \frac{\sin\theta}{\theta} . \tag{2.12}$$

But we have

$$\cos n\theta = \cos (n-1)\theta \cos\theta - \sin (n-1)\theta \sin\theta \le \cos (n-1)\theta \cos\theta < \cos (n-1)\theta\,\frac{\theta}{\sin\theta} ,$$

where we have again used the fundamental inequality $\cos\theta < \theta/\sin\theta$. This last relation
implies (2.12), so the proof of Theorem 2.4 is complete.
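
As a numerical illustration of Theorem 2.4 (not a substitute for the proof), the two sums in (2.6) can be evaluated for a fixed $a$ and several values of $n$. The following Python sketch assumes (2.6) has the form $(a/n)\sum_{k=1}^{n}\cos(ka/n) < \sin a < (a/n)\sum_{k=0}^{n-1}\cos(ka/n)$; the function name `cosine_sums` is ours, and floating-point arithmetic only approximates the exact inequalities.

```python
import math

def cosine_sums(a, n):
    """Return the lower and upper sums of (2.6) for 0 < a <= pi/2 and n >= 1."""
    h = a / n
    lower = h * sum(math.cos(k * h) for k in range(1, n + 1))
    upper = h * sum(math.cos(k * h) for k in range(0, n))
    return lower, upper

# The two sums squeeze sin(a) more tightly as n grows.
for n in (1, 10, 1000):
    lower, upper = cosine_sums(1.0, n)
    print(n, lower, math.sin(1.0), upper)
```

The gap between the two sums is $(a/n)(1 - \cos a)$, which tends to 0 as $n$ increases.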


THEOREM 2.5. If two functions sin and cos satisfy the fundamental properties 1 through 4,
then for every real $a$ we have

$$\int_0^a \cos x \, dx = \sin a , \tag{2.13}$$

$$\int_0^a \sin x \, dx = 1 - \cos a . \tag{2.14}$$


Proof. First we prove (2.13), and then we use (2.13) to deduce (2.14). Assume that
$0 < a \le \tfrac{1}{2}\pi$. Since the cosine is decreasing on $[0, a]$, we can apply Theorem 1.14 in
conjunction with the inequalities of Theorem 2.4 to obtain (2.13). The formula also holds
trivially for $a = 0$, since both members are zero. The general properties of the integral can
now be used to extend its validity to all real $a$.

For example, if $-\tfrac{1}{2}\pi \le a < 0$, then $0 < -a \le \tfrac{1}{2}\pi$, and the reflection property gives us

$$\int_0^a \cos x \, dx = -\int_0^{-a} \cos (-x) \, dx = -\int_0^{-a} \cos x \, dx = -\sin (-a) = \sin a .$$

Thus (2.13) is valid in the interval $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$. Now suppose that $\tfrac{1}{2}\pi \le a \le \tfrac{3}{2}\pi$. Then
$-\tfrac{1}{2}\pi \le a - \pi \le \tfrac{1}{2}\pi$, so we have

$$\int_0^a \cos x \, dx = \int_0^{\pi/2} \cos x \, dx + \int_{\pi/2}^{a} \cos x \, dx = \sin \tfrac{1}{2}\pi + \int_{-\pi/2}^{a-\pi} \cos (x + \pi) \, dx$$

$$= 1 - \int_{-\pi/2}^{a-\pi} \cos x \, dx = 1 - \sin (a - \pi) + \sin (-\tfrac{1}{2}\pi) = \sin a .$$

Thus (2.13) holds for all $a$ in the interval $[-\tfrac{1}{2}\pi, \tfrac{3}{2}\pi]$. But this interval has length $2\pi$, so
formula (2.13) holds for all $a$ since both members are periodic in $a$ with period $2\pi$.

Now we use (2.13) to deduce (2.14). First we prove that (2.14) holds when $a = \pi/2$.
Applying, in succession, the translation property, the co-relation $\sin (x + \tfrac{1}{2}\pi) = \cos x$,
and the reflection property, we find

$$\int_0^{\pi/2} \sin x \, dx = \int_{-\pi/2}^{0} \sin \left( x + \frac{\pi}{2} \right) dx = \int_{-\pi/2}^{0} \cos x \, dx = \int_0^{\pi/2} \cos (-x) \, dx .$$

Using the relation $\cos (-x) = \cos x$ and Equation (2.13), we obtain

$$\int_0^{\pi/2} \sin x \, dx = 1 .$$

Now, for any real $a$, we may write

$$\int_0^a \sin x \, dx = \int_0^{\pi/2} \sin x \, dx + \int_{\pi/2}^{a} \sin x \, dx = 1 + \int_0^{a-\pi/2} \sin \left( x + \frac{\pi}{2} \right) dx$$

$$= 1 + \int_0^{a-\pi/2} \cos x \, dx = 1 + \sin \left( a - \frac{\pi}{2} \right) = 1 - \cos a .$$

This shows that Equation (2.13) implies (2.14).
EXAMPLE 1. Using (2.13) and (2.14) in conjunction with the additive property

$$\int_a^b f(x) \, dx = \int_0^b f(x) \, dx - \int_0^a f(x) \, dx ,$$

we get the more general integration formulas

$$\int_a^b \cos x \, dx = \sin b - \sin a$$

and

$$\int_a^b \sin x \, dx = (1 - \cos b) - (1 - \cos a) = -(\cos b - \cos a) .$$

If again we use the special symbol $f(x) \big|_a^b$ to denote the difference $f(b) - f(a)$, we can write
these integration formulas in the form

$$\int_a^b \cos x \, dx = \sin x \,\Big|_a^b \qquad \text{and} \qquad \int_a^b \sin x \, dx = -\cos x \,\Big|_a^b .$$

EXAMPLE 2. Using the results of Example 1 and the expansion property

$$\int_a^b f(x) \, dx = \frac{1}{c} \int_{ca}^{cb} f(x/c) \, dx ,$$

we obtain the following formulas, valid for $c \neq 0$:

$$\int_a^b \cos cx \, dx = \frac{1}{c} \int_{ca}^{cb} \cos x \, dx = \frac{1}{c} (\sin cb - \sin ca) ,$$

and

$$\int_a^b \sin cx \, dx = \frac{1}{c} \int_{ca}^{cb} \sin x \, dx = -\frac{1}{c} (\cos cb - \cos ca) .$$

EXAMPLE 3. The identity $\cos 2x = 1 - 2\sin^2 x$ implies $\sin^2 x = \tfrac{1}{2}(1 - \cos 2x)$ so, from
Example 2, we obtain

$$\int_0^a \sin^2 x \, dx = \frac{1}{2} \int_0^a (1 - \cos 2x) \, dx = \frac{a}{2} - \frac{1}{4} \sin 2a .$$

Since $\sin^2 x + \cos^2 x = 1$, we also find

$$\int_0^a \cos^2 x \, dx = \int_0^a (1 - \sin^2 x) \, dx = a - \int_0^a \sin^2 x \, dx = \frac{a}{2} + \frac{1}{4} \sin 2a .$$
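
The formulas of Examples 2 and 3 can be checked numerically. The following Python sketch compares a midpoint Riemann sum with the closed forms; the helper `midpoint_integral` and the sample values of the limits and of $c$ are illustrative choices of ours, and agreement to several decimal places is a sanity check rather than a proof.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] by a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b, c = 0.2, 1.1, 3.0

# Example 2: integral of cos(cx) over [a, b] equals (sin cb - sin ca) / c.
numeric_cos = midpoint_integral(lambda x: math.cos(c * x), a, b)
exact_cos = (math.sin(c * b) - math.sin(c * a)) / c

# Example 3: integral of sin^2 x over [0, b] equals b/2 - (sin 2b)/4.
numeric_sin2 = midpoint_integral(lambda x: math.sin(x) ** 2, 0.0, b)
exact_sin2 = b / 2 - math.sin(2 * b) / 4

print(numeric_cos, exact_cos)
print(numeric_sin2, exact_sin2)
```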
2.7 A geometric description of the sine and cosine functions

In this section we indicate a geometric method for defining the sine and cosine functions,
and we give a geometric interpretation of the fundamental properties listed in Section 2.5.

Consider a circle of radius $r$ with its center at the origin. Denote the point $(r, 0)$ by $A$,
and let $P$ be any other point on the circle. The two line segments $OA$ and $OP$ determine a
geometric configuration called an angle which we denote by the symbol $\angle AOP$. An example
is shown in Figure 2.6. We wish to assign to this angle a nonnegative real number $x$ which
can be used as a measurement of its size. The most common way of doing this is to take a
circle of radius 1 and let $x$ be the length of the circular arc $AP$, traced counterclockwise
from $A$ to $P$, and to say that the measure of $\angle AOP$ is $x$ radians. From a logical point of
view, this is unsatisfactory at the present stage because we have not yet discussed the
concept of arc length. Arc length will be discussed later in Chapter 14. Since the concept
of area has already been discussed, we prefer to use the area of the circular sector $AOP$
rather than the length of the arc $AP$ as a measure of the size of $\angle AOP$. It is understood
that the sector $AOP$ is the smaller portion of the circular disk when $P$ is above the real axis,
and the larger portion when $P$ is below the real axis.

Figure 2.6 An angle $\angle AOP$ consisting of $x$ radians.

Figure 2.7 Geometric description of $\sin x$ and $\cos x$.

Later, when arc length is discussed, we shall find that the length of arc $AP$ is exactly
twice the area of sector $AOP$. Therefore, to get the same scale of measurement for angles
by both methods, we shall use twice the area of the sector $AOP$ as a measure of the angle
$\angle AOP$. However, to obtain a "dimensionless" measure of angles, that is, a measure
independent of the unit of distance in our coordinate system, we shall define the measure
of $\angle AOP$ to be twice the area of sector $AOP$ divided by the square of the radius. This ratio
does not change if we expand or contract the circle, and therefore there is no loss in
generality in restricting our considerations to a unit circle. The unit of measure so obtained
is called the radian. Thus, we say the measure of an angle $\angle AOP$ is $x$ radians if $x/2$ is the
area of the sector $AOP$ cut from a unit circular disk.

We have already introduced the symbol $\pi$ to denote the area of a unit circular disk. When
$P = (-1, 0)$, the sector $AOP$ is a semicircular disk of area $\tfrac{1}{2}\pi$, so it subtends an angle of $\pi$
radians. The entire disk is a sector consisting of $2\pi$ radians. If $P$ is initially at $(1, 0)$ and if
$P$ moves once around the circle in a counterclockwise direction, the area of sector $AOP$
increases from 0 to $\pi$, taking every value in the interval $[0, \pi]$ exactly once. This property,
which is geometrically plausible, can be proved by expressing the area as an integral, but
we shall not discuss the proof.

The next step is to define the sine and cosine of an angle. Actually, we prefer to speak
of the sine and cosine of a number rather than of an angle, so that the sine and cosine will
be functions defined on the real line. We proceed as follows: Choose a number $x$ satisfying
$0 < x < 2\pi$ and let $P$ be the point on the unit circle such that the area of sector $AOP$ is
equal to $x/2$. Let $(a, b)$ denote the coordinates of $P$. An example is shown in Figure 2.7.
The numbers $a$ and $b$ are completely determined by $x$. We define the sine and cosine of $x$
as follows:

$$\cos x = a , \qquad \sin x = b .$$

In other words, $\cos x$ is the abscissa of $P$ and $\sin x$ is its ordinate.

For example, when $x = \pi$, we have $P = (-1, 0)$ so that $\cos \pi = -1$ and $\sin \pi = 0$.
Similarly, when $x = \tfrac{1}{2}\pi$ we have $P = (0, 1)$ and hence $\cos \tfrac{1}{2}\pi = 0$ and $\sin \tfrac{1}{2}\pi = 1$. This
procedure describes the sine and cosine as functions defined in the open interval $(0, 2\pi)$.
We extend the definitions to the whole real axis by means of the following equations:

$$\sin 0 = 0 , \qquad \cos 0 = 1 , \qquad \sin (x + 2\pi) = \sin x , \qquad \cos (x + 2\pi) = \cos x .$$

The other four trigonometric functions are now defined in terms of the sine and cosine by
the usual formulas,

$$\tan x = \frac{\sin x}{\cos x} , \qquad \cot x = \frac{\cos x}{\sin x} , \qquad \sec x = \frac{1}{\cos x} , \qquad \csc x = \frac{1}{\sin x} .$$

These functions are defined for all real $x$ except for certain isolated points where the
denominators may be zero. They all satisfy the periodicity property $f(x + 2\pi) = f(x)$.
The tangent and cotangent have the smaller period $\pi$.

Now we give geometric arguments to indicate how these definitions lead to the fundamental
properties listed in Section 2.5. Properties 1 and 2 have already been taken care of
by the way we have defined the sine and cosine. The Pythagorean identity becomes evident
when we refer to Figure 2.7. The line segment $OP$ is the hypotenuse of a right triangle whose
legs have lengths $|\cos x|$ and $|\sin x|$. Hence the Pythagorean theorem for right triangles
implies the identity $\cos^2 x + \sin^2 x = 1$.

Now we use the Pythagorean theorem for right triangles again to give a geometric proof
of formula (2.3) for $\cos (y - x)$. Refer to the two right triangles $PAQ$ and $PBQ$ shown in
Figure 2.8. In triangle $PAQ$, the length of side $AQ$ is $|\sin y - \sin x|$, the absolute value of
the difference of the ordinates of $Q$ and $P$. Similarly, $AP$ has length $|\cos x - \cos y|$. If $d$
denotes the length of the hypotenuse $PQ$, we have, by the Pythagorean theorem,

$$d^2 = (\sin y - \sin x)^2 + (\cos x - \cos y)^2 .$$

On the other hand, in right triangle $PBQ$ the leg $BP$ has length $|1 - \cos (y - x)|$ and the
leg $BQ$ has length $|\sin (y - x)|$. Therefore, the Pythagorean theorem gives us

$$d^2 = [1 - \cos (y - x)]^2 + \sin^2 (y - x) .$$

Equating the two expressions for $d^2$ and solving for $\cos (y - x)$, we obtain the desired
formula (2.3) for $\cos (y - x)$.

Finally, geometric proofs of the fundamental inequalities in property 4 may be given by
referring to Figure 2.9. We simply compare the area of sector $OAP$ with that of triangles
$OQP$ and $OAB$. Because of the way we have defined angular measure, the area of sector
$OAP$ is $\tfrac{1}{2}x$. Triangle $OAB$ has base 1 and altitude $h$, say. By similar triangles, we find
$h/1 = (\sin x)/(\cos x)$, so the area of triangle $OAB$ is $\tfrac{1}{2}h = \tfrac{1}{2}(\sin x)/(\cos x)$. Therefore,
comparison of areas gives us the inequalities

$$\frac{1}{2} \sin x \cos x < \frac{1}{2} x < \frac{1}{2}\,\frac{\sin x}{\cos x} .$$

Figure 2.8 Geometric proof of the formula for $\cos (y - x)$.

Figure 2.9 Geometric proof of the inequalities $0 < \cos x < \dfrac{\sin x}{x} < \dfrac{1}{\cos x}$.

Dividing by $\tfrac{1}{2} \sin x$ and taking reciprocals, we obtain the fundamental inequalities (2.4).
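
The area comparison above can be spot-checked numerically. The short Python sketch below tests the inequalities $0 < \cos x < (\sin x)/x < 1/\cos x$ at sample points of the open interval $(0, \tfrac{1}{2}\pi)$; the function name and the choice of sample points are ours, and points very close to the endpoints are avoided because rounding makes strict inequalities unreliable there.

```python
import math

def fundamental_inequalities_hold(x):
    """Check 0 < cos x < (sin x)/x < 1/cos x for one x in (0, pi/2)."""
    c, s = math.cos(x), math.sin(x)
    return 0 < c < s / x < 1 / c

# Sample points 0.01, 0.02, ..., 1.55, all strictly inside (0, pi/2).
samples = [0.01 * k for k in range(1, 156)]
print(all(fundamental_inequalities_hold(x) for x in samples))
```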

We remind the reader once more that the discussion of this section is intended to provide 
a geometric interpretation of the sine and cosine and their fundamental properties. An 
analytic treatment of these functions, making no use of geometry, will be described in 
Section 11.11. 

Extensive tables of values of the sine, cosine, tangent, and cotangent appear in most 
mathematical handbooks. The graphs of the six trigonometric functions are shown in 
Figure 2.10 (page 107) as they appear over one complete period-interval. The rest of the 
graph in each case is obtained by appealing to periodicity. 


2.8 Exercises 

In this set of exercises, you may use the properties of the sine and cosine listed in Sections 2.5 
through 2.7. 

1. (a) Prove that $\sin n\pi = 0$ for every integer $n$ and that these are the only values of $x$ for which
$\sin x = 0$.

(b) Find all real $x$ such that $\cos x = 0$.

2. Find all real $x$ such that (a) $\sin x = 1$; (b) $\cos x = 1$; (c) $\sin x = -1$; (d) $\cos x = -1$.

3. Prove that $\sin (x + \pi) = -\sin x$ and $\cos (x + \pi) = -\cos x$ for all $x$.

4. Prove that $\sin 3x = 3 \sin x - 4 \sin^3 x$ and $\cos 3x = \cos x - 4 \sin^2 x \cos x$ for all real $x$.
Prove also that $\cos 3x = 4 \cos^3 x - 3 \cos x$.

5. (a) Prove that $\sin \tfrac{1}{6}\pi = \tfrac{1}{2}$, $\cos \tfrac{1}{6}\pi = \tfrac{1}{2}\sqrt{3}$. [Hint: Use Exercise 4.]

(b) Prove that $\sin \tfrac{1}{3}\pi = \tfrac{1}{2}\sqrt{3}$, $\cos \tfrac{1}{3}\pi = \tfrac{1}{2}$.

(c) Prove that $\sin \tfrac{1}{4}\pi = \cos \tfrac{1}{4}\pi = \tfrac{1}{2}\sqrt{2}$.

6. Prove that $\tan (x - y) = (\tan x - \tan y)/(1 + \tan x \tan y)$ for all $x$ and $y$ with $\tan x \tan y \neq -1$.
Obtain corresponding formulas for $\tan (x + y)$ and $\cot (x + y)$.

7. Find numbers $A$ and $B$ such that $3 \sin (x + \tfrac{1}{3}\pi) = A \sin x + B \cos x$ for all $x$.

8. Prove that if $C$ and $\alpha$ are given real numbers, there exist real numbers $A$ and $B$ such that
$C \sin (x + \alpha) = A \sin x + B \cos x$ for all $x$.

9. Prove that if $A$ and $B$ are given real numbers, there exist numbers $C$ and $\alpha$, with $C \ge 0$, such
that the formula of Exercise 8 holds.

10. Determine $C$ and $\alpha$, with $C > 0$, such that $C \sin (x + \alpha) = -2 \sin x - 2 \cos x$ for all $x$.

11. Prove that if $A$ and $B$ are given real numbers, there exist numbers $C$ and $\alpha$, with $C \ge 0$, such
that $C \cos (x + \alpha) = A \sin x + B \cos x$. Determine $C$ and $\alpha$ if $A = B = 1$.

12. Find all real $x$ such that $\sin x = \cos x$.

13. Find all real $x$ such that $\sin x - \cos x = 1$.

14. Prove that the following identities hold for all $x$ and $y$.

(a) $2 \cos x \cos y = \cos (x - y) + \cos (x + y)$.

(b) $2 \sin x \sin y = \cos (x - y) - \cos (x + y)$.

(c) $2 \sin x \cos y = \sin (x - y) + \sin (x + y)$.

15. If $h \neq 0$, prove that the following identities hold for all $x$:

$$\frac{\sin (x + h) - \sin x}{h} = \frac{\sin (h/2)}{h/2} \cos \left( x + \frac{h}{2} \right) ,$$

$$\frac{\cos (x + h) - \cos x}{h} = -\frac{\sin (h/2)}{h/2} \sin \left( x + \frac{h}{2} \right) .$$

These formulas are used in differential calculus.

16. Prove or disprove each of the following statements.

(a) For all $x \neq 0$, we have $\sin 2x \neq 2 \sin x$.

(b) For every $x$, there is a $y$ such that $\cos (x + y) = \cos x + \cos y$.

(c) There is an $x$ such that $\sin (x + y) = \sin x + \sin y$ for all $y$.

(d) There is a $y \neq 0$ such that $\int_0^y \sin x \, dx = \sin y$.

17. Calculate the integral $\int_a^b \sin x \, dx$ for each of the following values of $a$ and $b$. In each case
interpret your result geometrically in terms of areas.

(a) $a = 0$, $b = \pi/6$. (e) $a = 0$, $b = \pi$.

(b) $a = 0$, $b = \pi/4$. (f) $a = 0$, $b = 2\pi$.

(c) $a = 0$, $b = \pi/3$. (g) $a = -1$, $b = 1$.

(d) $a = 0$, $b = \pi/2$. (h) $a = -\pi/6$, $b = \pi/4$.

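Evaluate the integrals in Exercises 18 through 27.

18. $\int_0^{\pi} (x + \sin x) \, dx$.

19. $\int_0^{\pi/2} (x^2 + \cos x) \, dx$.

20. $\int_0^{\pi/2} (\sin x - \cos x) \, dx$.

21. $\int_0^{\pi/2} |\sin x - \cos x| \, dx$.

22. $\int_0^{\pi} (\tfrac{1}{2} + \cos t) \, dt$.

23. $\int_0^{\pi} |\tfrac{1}{2} + \cos t| \, dt$.

26. $\int_0^{\pi/2} \sin 2x \, dx$.

27. $\int_0^{\pi/3} \cos \dfrac{x}{2} \, dx$.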
28. Prove the following integration formulas, valid for $b \neq 0$:

$$\int_0^x \cos (a + bt) \, dt = \frac{1}{b} [\sin (a + bx) - \sin a] ,$$

$$\int_0^x \sin (a + bt) \, dt = -\frac{1}{b} [\cos (a + bx) - \cos a] .$$

29. (a) Use the identity $\sin 3t = 3 \sin t - 4 \sin^3 t$ to deduce the integration formula

$$\int_0^x \sin^3 t \, dt = \tfrac{2}{3} - \tfrac{1}{3} (2 + \sin^2 x) \cos x .$$

(b) Derive the identity $\cos 3t = 4 \cos^3 t - 3 \cos t$ and use it to prove that

$$\int_0^x \cos^3 t \, dt = \tfrac{1}{3} (2 + \cos^2 x) \sin x .$$

30. If a function $f$ is periodic with period $p > 0$ and integrable on $[0, p]$, prove that
$\int_0^p f(x) \, dx = \int_a^{a+p} f(x) \, dx$ for all $a$.

31. (a) Prove that $\int_0^{2\pi} \sin nx \, dx = \int_0^{2\pi} \cos nx \, dx = 0$ for all integers $n \neq 0$.

(b) Use part (a) and the addition formulas for the sine and cosine to establish the following
formulas, valid for integers $m$ and $n$ with $m^2 \neq n^2$:

$$\int_0^{2\pi} \sin nx \cos mx \, dx = \int_0^{2\pi} \sin nx \sin mx \, dx = \int_0^{2\pi} \cos nx \cos mx \, dx = 0 ,$$

$$\int_0^{2\pi} \sin^2 nx \, dx = \int_0^{2\pi} \cos^2 nx \, dx = \pi , \quad \text{if } n \neq 0 .$$

These formulas are known as the orthogonality relations for the sine and cosine.
32. Use the identity

$$2 \sin \frac{x}{2} \cos kx = \sin (2k + 1) \frac{x}{2} - \sin (2k - 1) \frac{x}{2}$$

and the telescoping property of finite sums to prove that if $x \neq 2m\pi$ ($m$ an integer), we have

$$\sum_{k=1}^{n} \cos kx = \frac{\sin \tfrac{1}{2} nx \, \cos \tfrac{1}{2} (n + 1) x}{\sin \tfrac{1}{2} x} .$$

33. If $x \neq 2m\pi$ ($m$ an integer), prove that

$$\sum_{k=1}^{n} \sin kx = \frac{\sin \tfrac{1}{2} nx \, \sin \tfrac{1}{2} (n + 1) x}{\sin \tfrac{1}{2} x} .$$

34. Refer to Figure 2.7. By comparing the area of triangle $OAP$ with that of the circular sector
$OAP$, prove that $\sin x < x$ if $0 < x < \tfrac{1}{2}\pi$. Then use the fact that $\sin (-x) = -\sin x$ to prove
that $|\sin x| < |x|$ if $0 < |x| < \tfrac{1}{2}\pi$.
2.9 Polar coordinates

Up to now we have located points in the plane with rectangular coordinates. We can
also locate them with polar coordinates. This is done as follows. Let $P$ be a point distinct
from the origin. Suppose the line segment joining $P$ to the origin has length $r > 0$ and
makes an angle of $\theta$ radians with the positive $x$-axis. An example is shown in Figure 2.11.
The two numbers $r$ and $\theta$ are called polar coordinates of $P$. They are related to the
rectangular coordinates $(x, y)$ by the equations

$$x = r \cos \theta , \qquad y = r \sin \theta . \tag{2.15}$$



Figure 2.11 Polar coordinates.

Figure 2.12 A figure-eight curve with polar equation $r = \sqrt{|\sin \theta|}$.


The positive number $r$ is called the radial distance of $P$, and $\theta$ is called a polar angle. We
say a polar angle rather than the polar angle because if $\theta$ satisfies (2.15), so does $\theta + 2n\pi$
for any integer $n$. We agree to call all pairs of real numbers $(r, \theta)$ polar coordinates of $P$ if
they satisfy (2.15) with $r > 0$. Thus, a given point has more than one pair of polar
coordinates. The radial distance $r$ is uniquely determined, $r = \sqrt{x^2 + y^2}$, but the polar
angle $\theta$ is determined only up to integer multiples of $2\pi$.

When $P$ is the origin, the equations in (2.15) are satisfied with $r = 0$ and any $\theta$. For this
reason we assign to the origin the radial distance $r = 0$, and we agree that any real $\theta$ may
be used as a polar angle.
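
The relation between the two coordinate systems can be sketched in a few lines of Python. The function names `to_polar` and `to_rectangular` are ours; `to_polar` returns just one of the infinitely many admissible polar angles, namely the one in $(-\pi, \pi]$ that `math.atan2` produces, and for the origin it returns the angle 0, which is one admissible choice among all real numbers.

```python
import math

def to_polar(x, y):
    """Return one pair (r, theta) of polar coordinates, with -pi < theta <= pi."""
    return math.hypot(x, y), math.atan2(y, x)

def to_rectangular(r, theta):
    """Apply the equations x = r cos(theta), y = r sin(theta) of (2.15)."""
    return r * math.cos(theta), r * math.sin(theta)

r, theta = to_polar(-1.0, 1.0)
print(r, theta)

# Adding an integer multiple of 2*pi to theta describes the same point.
print(to_rectangular(r, theta))
print(to_rectangular(r, theta + 2 * math.pi))
```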

Let $f$ be a nonnegative function defined on an interval $[a, b]$. The set of all points with
polar coordinates $(r, \theta)$ satisfying $r = f(\theta)$ is called the graph of $f$ in polar coordinates.
The equation $r = f(\theta)$ is called a polar equation of this graph. For some curves, polar
equations may be simpler and more convenient to use than Cartesian equations. For
example, the circle with Cartesian equation $x^2 + y^2 = 4$ has the simpler polar equation
$r = 2$. The equations in (2.15) show how to convert from rectangular to polar coordinates.

EXAMPLE. Figure 2.12 shows a curve in the shape of a figure eight whose Cartesian
equation is $(x^2 + y^2)^3 = y^2$. Using (2.15), we find $x^2 + y^2 = r^2$, so the polar coordinates of
the points on this curve satisfy the equation $r^6 = r^2 \sin^2 \theta$, or $r^2 = |\sin \theta|$, $r = \sqrt{|\sin \theta|}$.
It is not difficult to sketch this curve from the polar equation. For example, in the interval
$0 \le \theta \le \pi/2$, $\sin \theta$ increases from 0 to 1, so $r$ also increases from 0 to 1. Plotting a few
values which are easy to calculate, for example, those corresponding to $\theta = \pi/6$, $\pi/4$, and
$\pi/3$, we quickly sketch the portion of the curve in the first quadrant. The rest of the curve
is obtained by appealing to symmetry in the Cartesian equation, or to the symmetry and
periodicity of $|\sin \theta|$. It would be a more difficult task to sketch this curve from its
Cartesian equation alone.
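
The plotting procedure described in this example can be imitated numerically. The Python sketch below computes points of the curve from the polar equation $r = \sqrt{|\sin \theta|}$ and checks that each one satisfies the Cartesian equation $(x^2 + y^2)^3 = y^2$ up to rounding error; the function name `figure_eight_point` and the list of sample angles are our own choices.

```python
import math

def figure_eight_point(theta):
    """Rectangular coordinates of the point with polar angle theta on r = sqrt(|sin theta|)."""
    r = math.sqrt(abs(math.sin(theta)))
    return r * math.cos(theta), r * math.sin(theta)

for theta in (math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2):
    x, y = figure_eight_point(theta)
    # The residual of the Cartesian equation should be zero up to rounding.
    residual = (x * x + y * y) ** 3 - y * y
    print(round(x, 4), round(y, 4), residual)
```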


2.10 The integral for area in polar coordinates 

Let $f$ be a nonnegative function defined on an interval $[a, b]$, where $0 \le b - a \le 2\pi$.
The set of all points with polar coordinates $(r, \theta)$ satisfying the inequalities

$$0 \le r \le f(\theta) , \qquad a \le \theta \le b ,$$

is called the radial set of $f$ over $[a, b]$. The shaded region shown in Figure 2.13 is an example.
If $f$ is constant on $[a, b]$, its radial set is a circular sector subtending an angle of $b - a$
radians. Figure 2.14 shows the radial set $S$ of a step function $s$. Over each of the $n$ open
subintervals $(\theta_{k-1}, \theta_k)$ of $[a, b]$ in which $s$ is constant, say $s(\theta) = s_k$, the graph of $s$ in polar
coordinates is a circular arc of radius $s_k$, and its radial set is a circular sector subtending an
angle of $\theta_k - \theta_{k-1}$ radians. Because of the way we have defined angular measure, the area
of this sector is $\tfrac{1}{2} (\theta_k - \theta_{k-1}) s_k^2$. Since $b - a \le 2\pi$, none of these sectors overlap so, by
additivity, the area of the radial set of $s$ over the full interval $[a, b]$ is given by

$$a(S) = \frac{1}{2} \sum_{k=1}^{n} s_k^2 \, (\theta_k - \theta_{k-1}) = \frac{1}{2} \int_a^b s^2(\theta) \, d\theta ,$$

where $s^2(\theta)$ means the square of $s(\theta)$. Thus, for step functions, the area of the radial set has
been expressed as an integral. Now we prove that this integral formula holds more
generally.

Figure 2.14 The radial set of a step function $s$ is a union of circular sectors.
Its area is $\tfrac{1}{2} \int_a^b s^2(\theta) \, d\theta$.

THEOREM 2.6. Let $R$ denote the radial set of a nonnegative function $f$ over an interval
$[a, b]$, where $0 \le b - a \le 2\pi$, and assume that $R$ is measurable. If $f^2$ is integrable on $[a, b]$
the area of $R$ is given by the integral

$$a(R) = \frac{1}{2} \int_a^b f^2(\theta) \, d\theta .$$


Proof. Choose two step functions $s$ and $t$ satisfying

$$0 \le s(\theta) \le f(\theta) \le t(\theta)$$

for all $\theta$ in $[a, b]$, and let $S$ and $T$ denote their radial sets, respectively. Since $s \le f \le t$ on
$[a, b]$, the radial sets are related by the inclusion relations $S \subseteq R \subseteq T$. Hence, by the
monotone property of area, we have $a(S) \le a(R) \le a(T)$. But $S$ and $T$ are radial sets of
step functions, so $a(S) = \tfrac{1}{2} \int_a^b s^2(\theta) \, d\theta$ and $a(T) = \tfrac{1}{2} \int_a^b t^2(\theta) \, d\theta$. Therefore we have the
inequalities

$$\int_a^b s^2(\theta) \, d\theta \le 2a(R) \le \int_a^b t^2(\theta) \, d\theta ,$$

for all step functions $s$ and $t$ satisfying $s \le f \le t$ on $[a, b]$. But $s^2$ and $t^2$ are arbitrary step
functions satisfying $s^2 \le f^2 \le t^2$ on $[a, b]$; hence, since $f^2$ is integrable, we must have
$2a(R) = \int_a^b f^2(\theta) \, d\theta$. This proves the theorem.

Note: It can be proved that the measurability of $R$ is a consequence of the hypothesis
that $f^2$ is integrable, but we shall not discuss the proof.

EXAMPLE. To calculate the area of the radial set $R$ enclosed by the figure-eight curve
shown in Figure 2.12, we calculate the area of the portion in the first quadrant and multiply
by four. For this curve, we have $f^2(\theta) = |\sin \theta|$ and, since $\sin \theta \ge 0$ for $0 \le \theta \le \pi/2$, we
find

$$a(R) = 4 \int_0^{\pi/2} \tfrac{1}{2} f^2(\theta) \, d\theta = 2 \int_0^{\pi/2} \sin \theta \, d\theta = 2 \left( \cos 0 - \cos \tfrac{1}{2}\pi \right) = 2 .$$
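
The area formula of Theorem 2.6 lends itself to a numerical check. In the Python sketch below, `radial_area` is a hypothetical helper of ours that approximates $\tfrac{1}{2} \int_a^b f^2(\theta) \, d\theta$ by a midpoint sum; it is applied to the figure-eight curve over the full interval $[0, 2\pi]$ and to a constant function, whose radial set is a circular sector.

```python
import math

def radial_area(f, a, b, n=20000):
    """Approximate (1/2) * integral of f(theta)**2 over [a, b] by a midpoint sum."""
    h = (b - a) / n
    return 0.5 * h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))

def figure_eight(theta):
    return math.sqrt(abs(math.sin(theta)))

# Whole figure eight: the value should be close to 2.
print(radial_area(figure_eight, 0.0, 2 * math.pi))

# Sector of radius 3 subtending pi/2 radians: area (1/2) * 9 * (pi/2).
print(radial_area(lambda theta: 3.0, 0.0, math.pi / 2), 0.5 * 9 * math.pi / 2)
```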


2.11 Exercises 

In each of Exercises 1 through 4, show that the set of points whose rectangular coordinates
$(x, y)$ satisfy the given Cartesian equation is equal to the set of all points whose polar coordinates
$(r, \theta)$ satisfy the corresponding polar equation.

1. $(x - 1)^2 + y^2 = 1$; $r = 2 \cos \theta$, $\cos \theta \ge 0$.

2. $x^2 + y^2 - x = \sqrt{x^2 + y^2}$; $r = 1 + \cos \theta$.

3. $(x^2 + y^2)^2 = x^2 - y^2$, $y^2 \le x^2$; $r = \sqrt{\cos 2\theta}$, $\cos 2\theta \ge 0$.

4. $(x^2 + y^2)^2 = |x^2 - y^2|$; $r = \sqrt{|\cos 2\theta|}$.

In each of Exercises 5 through 15, sketch the graph of $f$ in polar coordinates and compute the
area of the radial set of $f$ over the interval specified. You may assume each set is measurable.

5. Spiral of Archimedes: $f(\theta) = \theta$, $0 \le \theta \le 2\pi$.

6. Circle tangent to $y$-axis: $f(\theta) = 2 \cos \theta$, $-\pi/2 \le \theta \le \pi/2$.

7. Two circles tangent to $y$-axis: $f(\theta) = 2 |\cos \theta|$, $0 \le \theta \le 2\pi$.

8. Circle tangent to $x$-axis: $f(\theta) = 4 \sin \theta$, $0 \le \theta \le \pi$.

9. Two circles tangent to $x$-axis: $f(\theta) = 4 |\sin \theta|$, $0 \le \theta \le 2\pi$.

10. Rose petal: $f(\theta) = \sin 2\theta$, $0 \le \theta \le \pi/2$.

11. Four-leaved rose: $f(\theta) = |\sin 2\theta|$, $0 \le \theta \le 2\pi$.

12. Lazy eight: $f(\theta) = \sqrt{|\cos \theta|}$, $0 \le \theta \le 2\pi$.

13. Four-leaf clover: $f(\theta) = \sqrt{|\cos 2\theta|}$, $0 \le \theta \le 2\pi$.

14. Cardioid: $f(\theta) = 1 + \cos \theta$, $0 \le \theta \le 2\pi$.

15. Limaçon: $f(\theta) = 2 + \cos \theta$, $0 \le \theta \le 2\pi$.

2.12 Application of integration to the calculation of volume 

In Section 1.6 we introduced the concept of area as a set function satisfying certain
properties which we took as axioms for area. Then, in Sections 1.18 and 2.2, we showed
that the areas of many regions could be calculated by integration. The same approach can
be used to discuss the concept of volume.

We assume there exist certain sets $S$ of points in three-dimensional space, which we call
measurable sets, and a set function $v$, called a volume function, which assigns to each
measurable set $S$ a number $v(S)$, called the volume of $S$. We use the symbol $\mathscr{A}$ to denote
the class of all measurable sets in three-dimensional space, and we call each set $S$ in $\mathscr{A}$ a
solid.

As in the case of area, we list a number of properties we would like volume to have and
take these as axioms for volume. The choice of axioms enables us to prove that the volumes
of many solids can be computed by integration.

The first three axioms, like those for area, describe the nonnegative, additive, and
difference properties. Instead of an axiom of invariance under congruence, we use a
different type of axiom, called Cavalieri's principle. This assigns equal volumes to congruent
solids and also to certain solids which, though not congruent, have equal cross-sectional
areas cut by planes perpendicular to a given line. More precisely, suppose $S$ is a given solid
and $L$ a given line. If a plane $F$ is perpendicular to $L$, the intersection $F \cap S$ is called a
cross-section perpendicular to $L$. If every cross-section perpendicular to $L$ is a measurable
set in its own plane, we call $S$ a Cavalieri solid. Cavalieri's principle assigns equal volumes
to two Cavalieri solids, $S$ and $T$, if $a(S \cap F) = a(T \cap F)$ for every plane $F$ perpendicular
to the given line $L$.

Cavalieri's principle can be illustrated intuitively as follows. Imagine a Cavalieri solid
as being a stack of thin sheets of material, like a deck of cards, each sheet being perpendicular
to a given line $L$. If we slide each sheet in its own plane we can change the shape of the solid
but not its volume.

The next axiom states that the volume of a rectangular parallelepiped is the product of
the lengths of its edges. A rectangular parallelepiped is any set congruent to a set of the form

$$\{ (x, y, z) \mid 0 \le x \le a , \; 0 \le y \le b , \; 0 \le z \le c \} . \tag{2.16}$$

We shall use the shorter term "box" rather than "rectangular parallelepiped." The
nonnegative numbers $a$, $b$, $c$ in (2.16) are called the lengths of the edges of the box.

Finally, we include an axiom which states that every convex set is measurable. A set is
called convex if, for every pair of points $P$ and $Q$ in the set, the line segment joining $P$ and
$Q$ is also in the set. This axiom, along with the additive and difference properties, ensures
that all the elementary solids that occur in the usual applications of calculus are measurable.

The axioms for volume can now be stated as follows. 

AXIOMATIC DEFINITION OF VOLUME. We assume there exists a class $\mathscr{A}$ of solids and a
set function $v$, whose domain is $\mathscr{A}$, with the following properties:

1. Nonnegative property. For each set $S$ in $\mathscr{A}$ we have $v(S) \ge 0$.

2. Additive property. If $S$ and $T$ are in $\mathscr{A}$, then $S \cup T$ and $S \cap T$ are in $\mathscr{A}$, and we have

$$v(S \cup T) = v(S) + v(T) - v(S \cap T) .$$

3. Difference property. If $S$ and $T$ are in $\mathscr{A}$ with $S \subseteq T$, then $T - S$ is in $\mathscr{A}$, and we
have $v(T - S) = v(T) - v(S)$.

4. Cavalieri's principle. If $S$ and $T$ are two Cavalieri solids in $\mathscr{A}$ with $a(S \cap F) \le
a(T \cap F)$ for every plane $F$ perpendicular to a given line, then $v(S) \le v(T)$.

5. Choice of scale. Every box $B$ is in $\mathscr{A}$. If the edges of $B$ have lengths $a$, $b$, and $c$, then
$v(B) = abc$.

6. Every convex set is in $\mathscr{A}$.

Axiom 3 shows that the empty set $\emptyset$ is in $\mathscr{A}$ and has zero volume. Since $v(T - S) \ge 0$,
Axiom 3 also implies the following monotone property:

$$v(S) \le v(T) , \quad \text{for sets } S \text{ and } T \text{ in } \mathscr{A} \text{ with } S \subseteq T .$$

The monotone property, in turn, shows that every bounded plane set $S$ in $\mathscr{A}$ has zero
volume. A plane set is called bounded if it is a subset of some square in the plane. If we
consider a box $B$ of altitude $c$ having this square as its base, then $S \subseteq B$ so that we have
$v(S) \le v(B) = a^2 c$, where $a$ is the length of each edge of the square base. If we had $v(S) > 0$,
we could choose $c$ so that $c < v(S)/a^2$, contradicting the inequality $v(S) \le a^2 c$. This shows
that $v(S)$ cannot be positive, so $v(S) = 0$, as asserted.

Note that Cavalieri's principle has been stated in the form of inequalities. If $a(S \cap F) =
a(T \cap F)$ for every plane $F$ perpendicular to a given line, we may apply Axiom 4 twice to
deduce $v(S) \le v(T)$ and $v(T) \le v(S)$, and hence we have $v(T) = v(S)$.

Next we show that the volume of a right cylindrical solid is equal to the area of its base
multiplied by its altitude. By a right cylindrical solid we mean a set congruent to a set $S$
of the form

$$S = \{ (x, y, z) \mid (x, y) \in B , \; a \le z \le b \} ,$$

where $B$ is a bounded plane measurable set. The areas of the cross sections of $S$ perpendicular
to the $z$-axis determine a cross-sectional area function $a_S$ which takes the constant
value $a(B)$ on the interval $a \le z \le b$, and the value 0 outside $[a, b]$.

Now let $T$ be a box with cross-sectional area function $a_T$ equal to $a_S$. Axiom 5 tells us
that $v(T) = a(B)(b - a)$, where $a(B)$ is the area of the base of $T$, and $b - a$ is its altitude.
Cavalieri's principle states that $v(S) = v(T)$, so the volume of $S$ is the area of its base,
$a(B)$, multiplied by its altitude, $b - a$. Note that the product $a(B)(b - a)$ is the integral
of the function $a_S$ over the interval $[a, b]$. In other words, the volume of a right cylindrical
solid is equal to the integral of its cross-sectional area function,

$$v(S) = \int_a^b a_S(z) \, dz .$$

We can extend this formula to more general Cavalieri solids. Let $R$ be a Cavalieri solid
with measurable cross-sections perpendicular to a given line $L$. Introduce a coordinate
axis along $L$ (call it the $u$-axis), and let $a_R(u)$ be the area of the cross section cut by a plane
perpendicular to $L$ at the point $u$. The volume of $R$ can be computed by the following
theorem.

THEOREM 2.7. Let $R$ be a Cavalieri solid in $\mathscr{A}$ with a cross-sectional area function $a_R$ which
is integrable on an interval $[a, b]$ and zero outside $[a, b]$. Then the volume of $R$ is equal to
the integral of the cross-sectional area:

$$v(R) = \int_a^b a_R(u) \, du .$$

Proof. Choose step functions $s$ and $t$ such that $s \le a_R \le t$ on $[a, b]$ and define $s$ and $t$
to be zero outside $[a, b]$. For each subinterval of $[a, b]$ on which $s$ is constant, we can
imagine a cylindrical solid (for example, a right circular cylinder) constructed so that its
cross-sectional area on this subinterval has the same constant value as $s$. The union of these
cylinders over all intervals of constancy of $s$ is a solid $S$ whose volume $v(S)$ is, by additivity,
equal to the integral $\int_a^b s(u) \, du$. Similarly, there is a solid $T$, a union of cylinders, whose
volume $v(T) = \int_a^b t(u) \, du$. But $a_S(u) = s(u) \le a_R(u) \le t(u) = a_T(u)$ for all $u$ in $[a, b]$, so
Cavalieri's principle implies that $v(S) \le v(R) \le v(T)$. In other words, $v(R)$ satisfies the
inequalities

$$\int_a^b s(u) \, du \le v(R) \le \int_a^b t(u) \, du$$

for all step functions $s$ and $t$ satisfying $s \le a_R \le t$ on $[a, b]$. Since $a_R$ is integrable on $[a, b]$,
it follows that $v(R) = \int_a^b a_R(u) \, du$.

EXAMPLE. Volume of a solid of revolution. Let $f$ be a function which is nonnegative and
integrable on an interval $[a, b]$. If the ordinate set of this function is revolved about the
$x$-axis, it sweeps out a solid of revolution. Each cross section cut by a plane perpendicular
to the $x$-axis is a circular disk. The area of the circular disk cut at the point $x$ is $\pi f^2(x)$,
where $f^2(x)$ means the square of $f(x)$. Therefore, by Theorem 2.7, the volume of the solid
(if the solid is in $\mathscr{A}$) is equal to the integral $\int_a^b \pi f^2(x) \, dx$, if the integral exists. In particular,
if $f(x) = \sqrt{r^2 - x^2}$ for $-r \le x \le r$, the ordinate set of $f$ is a semicircular disk of radius $r$
and the solid swept out is a sphere of radius $r$. The sphere is convex. Its volume is equal to

$$\int_{-r}^{r} \pi f^2(x) \, dx = \pi \int_{-r}^{r} (r^2 - x^2) \, dx = 2\pi \int_0^r (r^2 - x^2) \, dx = \tfrac{4}{3} \pi r^3 .$$
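
As a numerical companion to this example, the disk formula can be approximated by a midpoint sum and compared with $\tfrac{4}{3} \pi r^3$. The Python sketch below is only an illustration: the name `volume_of_revolution`, the number of subintervals, and the radius $r = 2$ are our own choices.

```python
import math

def volume_of_revolution(f, a, b, n=20000):
    """Approximate the integral of pi * f(x)**2 over [a, b] by a midpoint sum."""
    h = (b - a) / n
    return math.pi * h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))

r = 2.0
# The midpoints lie strictly inside (-r, r), so the square root is always defined.
sphere = volume_of_revolution(lambda x: math.sqrt(r * r - x * x), -r, r)
print(sphere, 4.0 / 3.0 * math.pi * r ** 3)
```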

More generally, suppose we have two nonnegative functions $f$ and $g$ which are integrable
on an interval $[a, b]$ and satisfy $f \le g$ on $[a, b]$. When the region between their graphs is
rotated about the $x$-axis, it sweeps out a solid of revolution such that each cross section cut
by a plane perpendicular to the $x$-axis at the point $x$ is an annulus (a region bounded by two
concentric circles) with area $\pi g^2(x) - \pi f^2(x)$. Therefore, if $g^2 - f^2$ is integrable, the volume
of such a solid (if the solid is in $\mathscr{A}$) is given by the integral

$$\int_a^b \pi [g^2(x) - f^2(x)] \, dx .$$
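
The annulus formula can be tried on a simple case. In the Python sketch below, the hypothetical helper `washer_volume` approximates $\int_a^b \pi [g^2(x) - f^2(x)] \, dx$ by a midpoint sum, assuming $0 \le f \le g$ on $[a, b]$. With $f(x) = x$ and $g(x) = 1$ on $[0, 1]$ (a cylinder with a cone removed) the exact value is $\pi (1 - \tfrac{1}{3}) = 2\pi/3$.

```python
import math

def washer_volume(f, g, a, b, n=20000):
    """Approximate the integral of pi * (g(x)**2 - f(x)**2) over [a, b].

    Assumes 0 <= f(x) <= g(x) on [a, b]; uses a midpoint sum.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += g(x) ** 2 - f(x) ** 2
    return math.pi * h * total

print(washer_volume(lambda x: x, lambda x: 1.0, 0.0, 1.0), 2 * math.pi / 3)
```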


2.13 Exercises 

1 . Use integration to compute the volume of a right circular cone generated by revolving the 
ordinate set of a linear function f(x) = cx over the interval 0 < x < ft. Show that the result 
is one-third the area of the base times the altitude of the cone. 

In each of Exercises 2 through 7, compute the volume of the solid generated by revolving the
ordinate set of the function f over the interval indicated. Sketch each of the ordinate sets.

2. $f(x) = \sqrt{x}$, $0 \le x \le 1$.
3. $f(x) = x^{1/4}$, $0 \le x \le 1$.
4. $f(x) = x^2$, $-1 \le x \le 2$.
5. $f(x) = \sin x$, $0 \le x \le \pi$.
6. $f(x) = \cos x$, $0 \le x \le \pi/2$.
7. $f(x) = \sin x + \cos x$, $0 \le x \le \pi$.

In each of Exercises 8 through 11, sketch the region between the graphs of f and g and compute
the volume of the solid obtained by rotating this region about the x-axis.

8. $f(x) = \sqrt{x}$, $g(x) = 1$, $0 \le x \le 1$.
9. $f(x) = \sqrt{x}$, $g(x) = x^2$, $0 \le x \le 1$.
10. $f(x) = \sin x$, $g(x) = \cos x$, $0 \le x \le \pi/4$.
11. $f(x) = \sqrt{4 - x^2}$, $g(x) = 1$, $0 \le x \le \sqrt{3}$.

12. Sketch the graphs of $f(x) = \sqrt{x}$ and $g(x) = x/2$ over the interval [0, 2]. Find a number t,
1 < t < 2, so that when the region between the graphs of f and g over the interval [0, t] is
rotated about the x-axis, it sweeps out a solid of revolution whose volume is equal to $\pi t^3/3$.

13. What volume of material is removed from a solid sphere of radius 2r by drilling a hole of radius 
r through the center? 

14. A napkin-ring is formed by drilling a cylindrical hole symmetrically through the center of a
solid sphere. If the length of the hole is 2h, prove that the volume of the napkin-ring is $\pi a h^3$,
where a is a rational number.

15. A solid has a circular base of radius 2. Each cross section cut by a plane perpendicular to a 
fixed diameter is an equilateral triangle. Compute the volume of the solid. 

16. The cross sections of a solid are squares perpendicular to the x-axis with their centers on the
axis. If the square cut off at x has edge $2x^2$, find the volume of the solid between x = 0 and
x = a. Make a sketch.

17. Find the volume of a solid whose cross section, made by a plane perpendicular to the x-axis,
has the area $ax^2 + bx + c$ for each x in the interval $0 \le x \le h$. Express the volume in terms
of the areas $B_1$, M, and $B_2$ of the cross sections corresponding to x = 0, x = h/2, and x = h,
respectively. The resulting formula is known as the prismoid formula.

18. Make a sketch of the region in the xy-plane consisting of all points (x, y) satisfying the
simultaneous inequalities 0 < x < 2, < y < 1 . Compute the volume of the solid obtained by

rotating this region about (a) the x-axis; (b) the y-axis; (c) the vertical line passing through 
(2,0); (d) the horizontal line passing through (0, 1). 


2.14 Application of integration to the concept of work 

Thus far our applications of integration have been to area and volume, concepts from 
geometry. Now we discuss an application to work, a concept from physics. 

Work is a measure of the energy expended by a force in moving a particle from one point 
to another. In this section we consider only the simplest case, linear motion. That is, we 
assume that the motion takes place along a line (which we take as the x-axis) from one 
point, say x = a, to another point, x = b, and we also assume that the force acts along this 
line. We permit either a < b or b < a. We assume further that the force acting on the 
particle is a function of the position. If the particle is at x, we denote by f (x) the force acting 
on it, where f (x) > 0 if the force acts in the direction of the positive x-axis, and f(x) < 0 if 
the force acts in the opposite direction. When the force is constant, say f(x) = c for all 
x between a and b, we define the work done by f to be the number $c \cdot (b - a)$, force times
displacement. The work may be positive or negative.

If force is measured in pounds and distance in feet, we measure work in foot-pounds;
if force is in dynes and distance in centimeters (the cgs system), work is measured in
dyne-centimeters. One dyne-centimeter of work is called an erg. If force is in newtons and
distance in meters (the mks system), work is in newton-meters. One newton-meter of work
is called a joule. One newton is $10^5$ dynes, and one joule is $10^7$ ergs.

EXAMPLE. A stone weighing 3 pounds (lb) is thrown upward along a straight line, rising
to a height of 15 feet (ft) and returning to the ground. We take the x-axis pointing up along
the line of motion. The constant force of gravity acts downward, so f(x) = -3 lb for each
x, $0 \le x \le 15$. The work done by gravity in moving the stone from, say, x = 6 ft to
x = 15 ft is $-3 \cdot (15 - 6) = -27$ foot-pounds (ft-lb). When the same stone falls from
x = 15 ft to x = 6 ft, the work done by gravity is $-3 \cdot (6 - 15) = 27$ ft-lb.

Now suppose the force is not necessarily constant but is a given function of position
defined on the interval joining a and b. How do we define the work done by f in moving a
particle from a to b? We proceed much as we did for area and volume. We state some
properties of work which are dictated by physical requirements. Then we prove that for
any definition of work which has these properties, the work done by an integrable force
function f is equal to the integral $\int_a^b f(x)\,dx$.

FUNDAMENTAL PROPERTIES OF WORK. Let $W_a^b(f)$ denote the work done by a force function
f in moving a particle from a to b. Then work has the following properties:

1. Additive property. If a < c < b, then $W_a^b(f) = W_a^c(f) + W_c^b(f)$.

2. Monotone property. If $f \le g$ on [a, b], then $W_a^b(f) \le W_a^b(g)$. That is, a greater force
does greater work.

3. Elementary formula. If f is constant, say f(x) = c for all x in the open interval (a, b),
then $W_a^b(f) = c \cdot (b - a)$.

The additive property can be extended by induction to any finite number of intervals.
That is, if $a = x_0 < x_1 < \cdots < x_n = b$, we have

$$W_a^b(f) = \sum_{k=1}^{n} W_k ,$$

where $W_k$ is the work done by f from $x_{k-1}$ to $x_k$. In particular, if the force is a step function
s which takes a constant value $s_k$ on the open interval $(x_{k-1}, x_k)$, property 3 states that
$W_k = s_k \cdot (x_k - x_{k-1})$, so we have

$$W_a^b(s) = \sum_{k=1}^{n} s_k \cdot (x_k - x_{k-1}) = \int_a^b s(x)\,dx .$$

Thus, for step functions, work has been expressed as an integral. Now it is an easy matter 
to prove that this holds true more generally. 

THEOREM 2.8. Suppose work has been defined for a class of force functions f in such a
way that it satisfies properties 1, 2, and 3. Then the work done by an integrable force function
f in moving a particle from a to b is equal to the integral of f,

$$W_a^b(f) = \int_a^b f(x)\,dx .$$

Proof. Let s and t be two step functions satisfying $s \le f \le t$ on [a, b]. The monotone
property of work states that $W_a^b(s) \le W_a^b(f) \le W_a^b(t)$. But $W_a^b(s) = \int_a^b s(x)\,dx$ and
$W_a^b(t) = \int_a^b t(x)\,dx$, so the number $W_a^b(f)$ satisfies the inequalities

$$\int_a^b s(x)\,dx \le W_a^b(f) \le \int_a^b t(x)\,dx$$

for all step functions s and t satisfying $s \le f \le t$ on [a, b]. Since f is integrable on [a, b],
it follows that $W_a^b(f) = \int_a^b f(x)\,dx$.

Note: Many authors simply define work to be the integral of the force function. 

The foregoing discussion serves as motivation for this definition. 

EXAMPLE. Work required to stretch a spring. Assume that the force f(x) needed to
stretch a steel spring a distance x beyond its natural length is proportional to x (Hooke’s
law). We place the x-axis along the axis of the spring. If the stretching force acts in the
positive direction of the axis, we have f(x) = cx, where the spring constant c is positive.
(The value of c can be determined if we know the force f(x) for a particular value of $x \ne 0$.)
The work required to stretch the spring a distance a is $\int_0^a f(x)\,dx = \int_0^a cx\,dx = ca^2/2$, a
number proportional to the square of the displacement.
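The squeeze used in the proof of Theorem 2.8 can be illustrated numerically for the spring. The sketch below builds a step function below the force and one above it (using left and right endpoint values, which is valid because f(x) = cx is increasing) and checks that $ca^2/2$ lies between the two step-function integrals. The spring constant, the stretch, and the function name `work_bounds` are hypothetical choices made only for this illustration.

```python
def work_bounds(f, a, b, n):
    """Integrals of two step functions that bracket an increasing force f on [a, b].

    The lower step function takes the left-endpoint value on each subinterval,
    the upper one takes the right-endpoint value.
    """
    h = (b - a) / n
    lower = sum(f(a + k * h) * h for k in range(n))
    upper = sum(f(a + (k + 1) * h) * h for k in range(n))
    return lower, upper

c = 120.0        # hypothetical spring constant
stretch = 0.5    # hypothetical displacement a

def hooke(x):
    return c * x

lower, upper = work_bounds(hooke, 0.0, stretch, 1000)
exact = c * stretch ** 2 / 2     # the value c*a^2/2 from the example
print(lower, exact, upper)
```

Increasing n narrows the gap between the two bounds, which is the numerical counterpart of the statement that only one number can lie between all lower and upper step-function integrals of an integrable force.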

A discussion of work for motion along curves other than straight lines is carried out in
Volume II with the aid of line integrals.

2.15 Exercises 

In Exercises 1 and 2 assume the force on the spring obeys Hooke’s law. 

1. If a ten-pound force stretches an elastic spring one inch, how much work is done in stretching
the spring one foot?

2. A spring has a natural length of 1 meter (m). A force of 100 newtons compresses it to 0.9 m.
How many joules of work are required to compress it to half its natural length? What is the
length of the spring when 20 joules of work have been expended?

3. A particle is moved along the x-axis by a propelling force $f(x) = 3x^2 + 4x$ newtons. Calculate
how many joules of work are done by the force to move the particle (a) from x = 0 to x = 7 m;
(b) from x = 2 m to x = 7 m.

4. A particle is to be moved along the x-axis by a quadratic propelling force $f(x) = ax^2 + bx$
dynes. Calculate a and b so that 900 ergs of work are required to move the particle 10
centimeters (cm) from the origin, if the force is 65 dynes when x = 5 cm.

5. A cable 50 feet in length and weighing 4 pounds per foot (lb/ft) hangs from a windlass. Cal- 
culate the work done in winding up 25 ft of the cable. Neglect all forces except gravity. 

6. Solve Exercise 5 if a 50 pound weight is attached to the end of the cable. 

7. A weight of 150 pounds is attached at one end of a long flexible chain weighing 2 lb/ft. The 
weight is initially suspended with 10 feet of chain over the edge of a building 100 feet in height. 
Neglect all forces except gravity and calculate the amount of work done by the force of gravity 
when the load is lowered to a position 10 feet above the ground. 

8. In Exercise 7, suppose that the chain is only 60 feet long and that the load and chain are allowed 
to drop to the ground, starting from the same initial position as before. Calculate the amount 
of work done by the force of gravity when the weight reaches the ground. 

9. Let V(q) denote the voltage required to place a charge q on the plates of a condenser. The work
required to charge a condenser from q = a to q = b is defined to be the integral $\int_a^b V(q)\,dq$.
If the voltage is proportional to the charge, prove that the work done to place a charge Q on
an uncharged condenser is $\tfrac{1}{2} Q V(Q)$.


2.16 Average value of a function 

In scientific work it is often necessary to make several measurements under similar 
conditions and then compute an average or mean for the purpose of summarizing the data. 
There are many useful types of averages, the most common being the arithmetic mean. If
$a_1, a_2, \ldots, a_n$ are n real numbers, their arithmetic mean $\bar{a}$ is defined by the equation

(2.17)    $$\bar{a} = \frac{1}{n} \sum_{k=1}^{n} a_k .$$

If the numbers $a_k$ are the values of a function f at n distinct points, say $a_k = f(x_k)$, then the
number

$$\frac{1}{n} \sum_{k=1}^{n} f(x_k)$$

is the arithmetic mean of the function values $f(x_1), \ldots, f(x_n)$. We can extend this concept
to compute an average value not only for a finite number of values of f(x) but for all values
of f(x) where x runs through an interval. The following definition serves this purpose.


DEFINITION OF AVERAGE VALUE OF A FUNCTION ON AN INTERVAL. If f is integrable on
an interval [a, b], we define A(f), the average value of f on [a, b], by the formula

(2.18)    $$A(f) = \frac{1}{b - a} \int_a^b f(x)\,dx .$$

When f is nonnegative, this formula has a simple geometric interpretation. Written in
the form $(b - a)A(f) = \int_a^b f(x)\,dx$, it states that the rectangle of altitude A(f) and base
[a, b] has an area equal to that of the ordinate set of f over [a, b].

Now we can show that formula (2.18) is actually an extension of the concept of the
arithmetic mean. Let f be a step function which is constant on n equal subintervals of
[a, b]. Specifically, let $x_k = a + k(b - a)/n$ for k = 0, 1, 2, ..., n, and suppose that
$f(x) = f(x_k)$ if $x_{k-1} < x < x_k$. Then $x_k - x_{k-1} = (b - a)/n$, so we have

$$A(f) = \frac{1}{b - a} \int_a^b f(x)\,dx = \frac{1}{b - a} \sum_{k=1}^{n} f(x_k)\,\frac{b - a}{n} = \frac{1}{n} \sum_{k=1}^{n} f(x_k) .$$

Thus, for step functions, the average A(f) is the same as the arithmetic mean of the values
$f(x_1), \ldots, f(x_n)$ taken on the intervals of constancy.
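The agreement between formula (2.18) and the arithmetic mean for step functions can be checked numerically. The following sketch approximates A(f) by a midpoint sum for a step function with five equal intervals of constancy and compares the result with the ordinary mean of its values. The list `values`, the interval [2, 7], and the helper names are hypothetical, chosen only for illustration.

```python
def average(f, a, b, n=10_000):
    """Approximate A(f) = (1/(b-a)) * integral of f over [a, b] by a midpoint sum."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h / (b - a)

values = [3.0, -1.0, 4.0, 1.0, 5.0]     # hypothetical values f(x_1), ..., f(x_5)
a, b = 2.0, 7.0
width = (b - a) / len(values)           # length of each interval of constancy

def step(x):
    """Step function equal to values[k] on the k-th of the equal subintervals of [a, b]."""
    k = min(int((x - a) / width), len(values) - 1)
    return values[k]

step_average = average(step, a, b)
arithmetic_mean = sum(values) / len(values)
print(step_average, arithmetic_mean)
```

Because the number of sample points is a multiple of the number of subintervals, every interval of constancy receives the same number of midpoints, so the two numbers agree up to rounding.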

Weighted arithmetic means are often used in place of the ordinary arithmetic mean in
(2.17). If $w_1, w_2, \ldots, w_n$ are n nonnegative numbers (called weights), not all zero, the
weighted arithmetic mean $\bar{a}$ of $a_1, a_2, \ldots, a_n$ is defined by the formula

$$\bar{a} = \frac{\sum_{k=1}^{n} w_k a_k}{\sum_{k=1}^{n} w_k} .$$

When the weights are all equal, this reduces to the ordinary arithmetic mean. The extension
of this concept to integrable functions is given by the formula

(2.19)    $$A(f) = \frac{\int_a^b w(x) f(x)\,dx}{\int_a^b w(x)\,dx} ,$$

where w is a nonnegative weight function with $\int_a^b w(x)\,dx \ne 0$.

Weighted averages are widely used in physics and engineering, as well as in mathematics. 
For example, consider a straight rod of length a made of a material of varying density. 
Place the rod along the positive x-axis with one end at the origin 0, and let m(x) denote the 
mass of a portion of the rod of length x, measured from 0. If $m(x) = \int_0^x \rho(t)\,dt$ for some
integrable function $\rho$ ($\rho$ is the Greek letter rho), then $\rho$ is called the mass density of the rod.
A uniform rod is one whose mass density is constant. The integral $\int_0^a x\rho(x)\,dx$ is called the
first moment of the rod about 0, and the center of mass is the point whose x-coordinate is

$$\bar{x} = \frac{\int_0^a x\rho(x)\,dx}{\int_0^a \rho(x)\,dx} .$$

This is an example of a weighted average. We are averaging the distance function f(x) = x
with the mass density $\rho$ as weight function.

The integral $\int_0^a x^2\rho(x)\,dx$ is called the second moment, or moment of inertia, of the rod
about 0, and the positive number r given by the formula

$$r^2 = \frac{\int_0^a x^2\rho(x)\,dx}{\int_0^a \rho(x)\,dx}$$

is called the radius of gyration of the rod. In this case, the function being averaged is the
square of the distance function, $f(x) = x^2$, with the mass density $\rho$ as the weight function.
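Both the center of mass and the radius of gyration are instances of the weighted average (2.19), which the following sketch makes concrete. It uses a midpoint sum for the integrals; the density $\rho(x) = 1 + x$, the rod length 2, and the helper names are hypothetical choices for illustration. For this density the exact values, computed by hand from the formulas above, are $\bar{x} = 7/6$ and $r^2 = 5/3$.

```python
import math

def integrate(f, a, b, n=20_000):
    """Midpoint-sum approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def weighted_average(f, w, a, b):
    """Weighted average of f with weight w on [a, b], as in formula (2.19)."""
    return integrate(lambda x: w(x) * f(x), a, b) / integrate(w, a, b)

def rho(x):
    return 1.0 + x          # hypothetical mass density

length = 2.0                # hypothetical rod length

center = weighted_average(lambda x: x, rho, 0.0, length)
radius_of_gyration = math.sqrt(weighted_average(lambda x: x * x, rho, 0.0, length))
print(center, radius_of_gyration)
```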

Weighted averages like these also occur in the mathematical theory of probability where 
the concepts of expectation and variance play the same role as center of mass and moment 
of inertia. 

2.17 Exercises 

In Exercises 1 through 10, compute the average A(f) for the given function f over the specified
interval.

1. $f(x) = x^2$, $a \le x \le b$.
2. $f(x) = x^2 + x^3$, $0 \le x \le 1$.
3. $f(x) = x^{1/2}$, $0 \le x \le 4$.
4. $f(x) = x^{1/3}$, $1 \le x \le 8$.
5. $f(x) = \sin x$, $0 \le x \le \pi/2$.
6. $f(x) = \cos x$, $-\pi/2 \le x \le \pi/2$.
7. $f(x) = \sin 2x$, $0 \le x \le \pi/2$.
8. $f(x) = \sin x \cos x$, $0 \le x \le \pi/4$.
9. $f(x) = \sin^2 x$, $0 \le x \le \pi/2$.
10. $f(x) = \cos^2 x$, $0 \le x \le \pi$.

11. (a) If $f(x) = x^2$ for $0 \le x \le a$, find a number c satisfying 0 < c < a such that f(c) is equal to
the average of f in [0, a].

(b) Solve part (a) if $f(x) = x^n$, where n is any positive integer.

12. Let $f(x) = x^2$ for $0 \le x \le 1$. The average value of f on [0, 1] is 1/3. Find a nonnegative weight
function w such that the weighted average of f on [0, 1], as defined by Equation (2.19), is

(a) *; (b) f; (c) 8. 

13. Let A(f) denote the average of f over an interval [a, b]. Prove that the average has the following
properties:

(a) Additive property: A(f + g) = A(f) + A(g).

(b) Homogeneous property: A(cf) = cA(f) if c is any real number.

(c) Monotone property: $A(f) \le A(g)$ if $f \le g$ on [a, b].

14. Which of the properties in Exercise 13 are valid for weighted averages as defined by Equation
(2.19)?

15. Let $A_a^b(f)$ denote the average of f on an interval [a, b].

(a) If a < c < b, prove that there is a number t satisfying 0 < t < 1 such that
$A_a^b(f) = tA_a^c(f) + (1 - t)A_c^b(f)$. Thus, $A_a^b(f)$ is a weighted arithmetic mean of $A_a^c(f)$ and $A_c^b(f)$.

(b) Prove that the result of part (a) also holds for weighted averages as defined by Equation
(2.19).

Each of Exercises 16 through 21 refers to a rod of length L placed on the x-axis with one end at
the origin. For the mass density $\rho$ as described in each case, calculate (a) the center of mass of the
rod, (b) the moment of inertia about the origin, and (c) the radius of gyration.

16. $\rho(x) = 1$ for $0 \le x \le L$.

17. $\rho(x) = 1$ for $0 \le x \le L/2$, $\rho(x) = 2$ for $L/2 < x \le L$.

18. $\rho(x) = x$ for $0 \le x \le L$.

19. $\rho(x) = x$ for $0 \le x \le L/2$, $\rho(x) = L/2$ for $L/2 \le x \le L$.

20. $\rho(x) = x^2$ for $0 \le x \le L$.

21. $\rho(x) = x^2$ for $0 \le x \le L/2$, $\rho(x) = L^2/4$ for $L/2 \le x \le L$.

22. Determine a mass density $\rho$ so that the center of mass of a rod of length L will be at a distance
L/4 from one end of the rod.

23. In an electrical circuit, the voltage e(t) at time t is given by the formula e(t) = 3 sin 2t.
Calculate the following: (a) the average voltage over the time interval $[0, \pi/2]$; (b) the
root-mean-square of the voltage; that is, the square root of the average of the function $e^2$ in the interval
$[0, \pi/2]$.

24. In an electrical circuit, the voltage e(t) and the current i(t) at time t are given by the formulas
e(t) = 160 sin t, $i(t) = 2 \sin(t - \pi/6)$. The average power is defined to be

$$\frac{1}{T} \int_0^T e(t)\,i(t)\,dt ,$$

where T is the period of both the voltage and the current. Determine T and calculate the
average power.


2.18 The integral as a function of the upper limit. Indefinite integrals 

In this section we assume that f is a function such that the integral $\int_a^x f(t)\,dt$ exists for each
x in an interval [a, b]. We shall keep a and f fixed and study this integral as a function of x.
We denote the value of the integral by A(x), so that we have

(2.20)    $$A(x) = \int_a^x f(t)\,dt \quad \text{if} \quad a \le x \le b .$$

An equation like this enables us to construct a new function A from a given function f, the
value of A at each point in [a, b] being determined by Equation (2.20). The function A is
sometimes referred to as an indefinite integral of f, and it is said to be obtained from f by
integration. We say an indefinite integral rather than the indefinite integral because A also
depends on the lower limit a. Different values of a will lead to different functions A. If we
use a different lower limit, say c, and define another indefinite integral F by the equation

$$F(x) = \int_c^x f(t)\,dt ,$$

then the additive property tells us that

$$A(x) - F(x) = \int_a^x f(t)\,dt - \int_c^x f(t)\,dt = \int_a^c f(t)\,dt ,$$


and hence the difference A(x) — F(x) is independent of x. Therefore any two indefinite 
integrals of the same function differ only by a constant (the constant depends on the choice 
of a and c). 

When an indefinite integral of f is known, the value of an integral such as $\int_a^b f(t)\,dt$ may
be evaluated by a simple subtraction. For example, if n is a nonnegative integer, we have
the formula of Theorem 1.15,

$$\int_0^x t^n\,dt = \frac{x^{n+1}}{n + 1} ,$$

and the additive property implies that

$$\int_a^b t^n\,dt = \int_0^b t^n\,dt - \int_0^a t^n\,dt = \frac{b^{n+1} - a^{n+1}}{n + 1} .$$

In general, if $F(x) = \int_c^x f(t)\,dt$, then we have

(2.21)    $$\int_a^b f(t)\,dt = \int_c^b f(t)\,dt - \int_c^a f(t)\,dt = F(b) - F(a) .$$

A different choice of c merely changes F(x) by a constant; this does not alter the difference
F(b) - F(a), because the constant cancels out in the subtraction.
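Equation (2.21) and the remark about the choice of c can be checked numerically. The sketch below forms two indefinite integrals of $f(t) = t^3$ with different lower limits and verifies that both give the same difference between b and a, matching $(b^4 - a^4)/4$. The midpoint-sum integrator, the lower limits 0 and -1.5, and the function names are assumptions made for this illustration only.

```python
def integrate(f, a, b, n=20_000):
    """Midpoint-sum approximation of the integral of f from a to b (signed, so a > b is allowed)."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def indefinite_integral(f, c):
    """Return the function F with F(x) = integral of f from c to x."""
    return lambda x: integrate(f, c, x)

def cube(t):
    return t ** 3

F = indefinite_integral(cube, 0.0)      # lower limit c = 0
G = indefinite_integral(cube, -1.5)     # a different lower limit (illustrative)

a, b = 1.0, 2.0
from_F = F(b) - F(a)
from_G = G(b) - G(a)
exact = (b ** 4 - a ** 4) / 4           # (b^(n+1) - a^(n+1)) / (n+1) with n = 3
print(from_F, from_G, exact)
```

The values F(x) and G(x) themselves differ by a constant, but that constant drops out of both differences.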

If we use the special symbol

$$F(x) \Big|_a^b$$

to denote the difference F(b) - F(a), Equation (2.21) may be written as

$$\int_a^b f(x)\,dx = F(x) \Big|_a^b = F(b) - F(a) .$$

There is, of course, a very simple geometric relationship between a function f and its
indefinite integrals. An example is illustrated in Figure 2.15(a), where f is a nonnegative
function and the number A(x) is equal to the area of the shaded region under the graph of
f from a to x. If f assumes both positive and negative values, as in Figure 2.15(b), the
integral A(x) gives the sum of the areas of the regions above the x-axis minus the sum of
the areas below the x-axis.

Many of the functions that occur in various branches of science arise exactly in this way, 
as indefinite integrals of other functions. This is one of the reasons that a large part of 
calculus is devoted to the study of indefinite integrals. 

Sometimes a knowledge of a special property of f implies a corresponding special property
of the indefinite integral. For example, if f is nonnegative on [a, b], then the indefinite
integral A is increasing, since we have

$$A(y) - A(x) = \int_a^y f(t)\,dt - \int_a^x f(t)\,dt = \int_x^y f(t)\,dt \ge 0 ,$$

whenever $a \le x \le y \le b$. Interpreted geometrically, this means that the area under the
graph of a nonnegative function from a to x cannot decrease as x increases.

Figure 2.15 Indefinite integral interpreted geometrically in terms of area. In panel (b), $\int_a^x f(t)\,dt$ = algebraic sum of areas.

Figure 2.16 Geometric interpretation of convexity and concavity. (a) A convex function. (b) A concave function.

Now we discuss another property which is not immediately evident geometrically.
Suppose f is increasing on [a, b]. We can prove that the indefinite integral A has a property
known as convexity. Its graph bends upward, as illustrated in Figure 2.16(a); that is, the
chord joining any two points on the graph always lies above the graph. An analytic
definition of convexity may be given as follows.

DEFINITION OF A CONVEX FUNCTION. A function g is said to be convex on an interval
[a, b] if, for all x and y in [a, b] and for every $\alpha$ satisfying $0 < \alpha < 1$, we have

(2.22)    $$g(z) \le \alpha g(y) + (1 - \alpha)g(x), \quad \text{where} \quad z = \alpha y + (1 - \alpha)x .$$

We say g is concave on [a, b] if the reverse inequality holds,

$$g(z) \ge \alpha g(y) + (1 - \alpha)g(x), \quad \text{where} \quad z = \alpha y + (1 - \alpha)x .$$

These inequalities have a simple geometric interpretation. The point $z = \alpha y + (1 - \alpha)x$
satisfies $z - x = \alpha(y - x)$. If x < y, this point divides the interval [x, y] into two
subintervals, [x, z] and [z, y], the length of [x, z] being $\alpha$ times that of [x, y]. As $\alpha$ runs from 0
to 1, the point $(z, \alpha g(y) + (1 - \alpha)g(x))$ traces out the line segment joining the points (x, g(x))
and (y, g(y)) on the graph of g. Inequality (2.22) states that the graph of g never goes above
this line segment. Figure 2.16(a) shows an example with $\alpha = \tfrac{1}{2}$. For a concave function,
the graph never goes below the line segment, as illustrated by the example in Figure 2.16(b).

THEOREM 2.9. Let $A(x) = \int_a^x f(t)\,dt$. Then A is convex on every interval where f is
increasing, and concave on every interval where f is decreasing.

Proof. Assume f is increasing on [a, b], choose x < y, and let $z = \alpha y + (1 - \alpha)x$. We
are to prove that $A(z) \le \alpha A(y) + (1 - \alpha)A(x)$. Since $A(z) = \alpha A(z) + (1 - \alpha)A(z)$, this
is the same as proving that $\alpha A(z) + (1 - \alpha)A(z) \le \alpha A(y) + (1 - \alpha)A(x)$, or that

$$(1 - \alpha)[A(z) - A(x)] \le \alpha[A(y) - A(z)] .$$

Since we have $A(z) - A(x) = \int_x^z f(t)\,dt$ and $A(y) - A(z) = \int_z^y f(t)\,dt$, we are to prove that

(2.23)    $$(1 - \alpha) \int_x^z f(t)\,dt \le \alpha \int_z^y f(t)\,dt .$$

But f is increasing, so we have the inequalities

$$f(t) \le f(z) \ \text{ if } \ x \le t \le z , \qquad \text{and} \qquad f(z) \le f(t) \ \text{ if } \ z \le t \le y .$$

Integrating these inequalities we find

$$\int_x^z f(t)\,dt \le f(z)(z - x) , \qquad \text{and} \qquad f(z)(y - z) \le \int_z^y f(t)\,dt .$$

But $(1 - \alpha)(z - x) = \alpha(y - z)$, so these inequalities give us

$$(1 - \alpha) \int_x^z f(t)\,dt \le (1 - \alpha) f(z)(z - x) = \alpha f(z)(y - z) \le \alpha \int_z^y f(t)\,dt ,$$

which proves (2.23). This proves that A is convex when f is increasing. When f is decreasing,
we may apply the result just proved to -f.

EXAMPLE. The cosine function decreases in the interval $[0, \pi]$. Since $\sin x = \int_0^x \cos t\,dt$,
the graph of the sine function is concave in the interval $[0, \pi]$. In the interval $[\pi, 2\pi]$, the
cosine increases and the sine function is convex.
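The conclusion of the example can be spot-checked against inequality (2.22). The sketch below tests the convexity inequality on a finite grid of points x, y and weights $\alpha$; concavity of g is tested as convexity of -g. The function name `is_convex_on_grid`, the grid sizes, and the rounding tolerance are assumptions for illustration, and passing such a test on a grid is evidence, not a proof.

```python
import math

def is_convex_on_grid(g, a, b, steps=40, tol=1e-12):
    """Test g(z) <= alpha*g(y) + (1-alpha)*g(x), z = alpha*y + (1-alpha)*x, on a finite grid."""
    xs = [a + (b - a) * i / steps for i in range(steps + 1)]
    alphas = [j / 10 for j in range(1, 10)]          # alpha = 0.1, ..., 0.9
    for x in xs:
        for y in xs:
            for alpha in alphas:
                z = alpha * y + (1 - alpha) * x
                if g(z) > alpha * g(y) + (1 - alpha) * g(x) + tol:
                    return False
    return True

def negative_sine(x):
    return -math.sin(x)

sine_concave_on_0_pi = is_convex_on_grid(negative_sine, 0.0, math.pi)
sine_convex_on_pi_2pi = is_convex_on_grid(math.sin, math.pi, 2 * math.pi)
sine_convex_on_0_pi = is_convex_on_grid(math.sin, 0.0, math.pi)
print(sine_concave_on_0_pi, sine_convex_on_pi_2pi, sine_convex_on_0_pi)
```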

Figure 2.17 illustrates further properties of indefinite integrals. The graph on the left is
that of the greatest-integer function, f(x) = [x]; the graph on the right is that of the
indefinite integral $A(x) = \int_0^x [t]\,dt$. On those intervals where f is constant, the function A
is linear. We describe this by saying that the integral of a step function is piecewise linear.

Figure 2.17 The indefinite integral of a step function is piecewise linear.

Observe also that the graph of f is made up of disconnected line segments. There are
points on the graph of f where a small change in x produces a sudden jump in the value of
the function. Note, however, that the corresponding indefinite integral does not exhibit
this behavior. A small change in x produces only a small change in A(x). That is why the
graph of A is not disconnected. This illustrates a general property of indefinite integrals
known as continuity. In the next chapter we shall discuss the concept of continuity in
detail and prove that the indefinite integral is always a continuous function.
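Both observations about Figure 2.17 can be seen in a few lines of code. For $x \ge 0$ the integral $\int_0^x [t]\,dt$ has the closed form $n(n-1)/2 + n(x - n)$ with n = [x], which follows by adding the areas of the unit-width rectangles of heights 0, 1, ..., n - 1 and the last partial rectangle. The sketch below uses that form; the function name `A` mirrors the text, while the sample points and the offset $10^{-9}$ are arbitrary illustrative choices.

```python
import math

def A(x):
    """Integral of the greatest-integer function [t] from 0 to x, for x >= 0.

    On [n, n+1] the integrand equals n, and the integral from 0 to n is
    0 + 1 + ... + (n-1) = n(n-1)/2.
    """
    n = math.floor(x)
    return n * (n - 1) / 2 + n * (x - n)

samples = [A(x) for x in (0.5, 1.0, 2.0, 2.5, 3.0)]
jump_in_f = math.floor(2.0) - math.floor(2.0 - 1e-9)   # f jumps by 1 at x = 2
change_in_A = A(2.0) - A(2.0 - 1e-9)                   # A changes only slightly
print(samples, jump_in_f, change_in_A)
```

The value A(2.5) lies midway between A(2) and A(3), as expected for a function that is linear on [2, 3], and the change in A across x = 2 is tiny even though f jumps there.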


2.19 Exercises 

Evaluate the integrals in Exercises 1 through 16.

1. $\int_0^x (1 + t + t^2)\,dt$.
2. $\int_0^{2y} (1 + t + t^2)\,dt$.
3. $\int_{-1}^{2x} (1 + t + t^2)\,dt$.
4. $\int_1^{1-x} (1 - 2t + 3t^2)\,dt$.
5. $\int_{-2}^{x} t^2(t^2 + 1)\,dt$.
6. $\int_x^{x^2} (t^2 + 1)^2\,dt$.
7. $\int_1^x (t^{1/2} + 1)\,dt$, x > 0.
8. $\int_x^{x^2} (t^{1/2} + t^{1/4})\,dt$, x > 0.
9. $\int_{-\pi}^{x} \cos t\,dt$.
10. $\int_0^{x^2} (\tfrac{1}{2} + \cos t)\,dt$.
11. $\int_x^{x^2} (\tfrac{1}{2} - \sin t)\,dt$.
12. $\int_0^x (u^2 + \sin 3u)\,du$.
13. $\int_x^{x^2} (v^2 + \sin 3v)\,dv$.
14. $\int_0^y (\sin^2 x + x)\,dx$.
15. $\int_0^x (\sin 2w + \cos \tfrac{w}{2})\,dw$.
16. $\int_{-\pi}^{x} (\tfrac{1}{2} + \cos t)^2\,dt$.

17. Find all real values of x such that

$$\int_0^x (t^3 - t)\,dt = \frac{1}{3} \int_{\sqrt{2}}^{x} (t - t^3)\,dt .$$

Draw a suitable figure and interpret the equation geometrically.

18. Let $f(x) = x - [x] - \tfrac{1}{2}$ if x is not an integer, and let f(x) = 0 if x is an integer. (As usual,
[x] denotes the greatest integer $\le x$.) Define a new function P as follows:

$$P(x) = \int_0^x f(t)\,dt \quad \text{for every real } x .$$

(a) Draw the graph of f over the interval [-3, 3] and prove that f is periodic with period 1:
f(x + 1) = f(x) for all x.

(b) Prove that $P(x) = \tfrac{1}{2}(x^2 - x)$, if $0 \le x \le 1$ and that P is periodic with period 1.

(c) Express P(x) in terms of [x].

(d) Determine a constant c such that $\int_0^1 (P(t) + c)\,dt = 0$.

(e) For the constant c of part (d), let $Q(x) = \int_0^x (P(t) + c)\,dt$. Prove that Q is periodic with
period 1 and that

$$Q(x) = \frac{x^3}{6} - \frac{x^2}{4} + \frac{x}{12} \quad \text{if} \quad 0 \le x \le 1 .$$

19. Given an odd function f, defined everywhere, periodic with period 2, and integrable on every
interval. Let $g(x) = \int_0^x f(t)\,dt$.

(a) Prove that g(2n) = 0 for every integer n.

(b) Prove that g is even and periodic with period 2.

20. Given an even function f, defined everywhere, periodic with period 2, and integrable on every
interval. Let $g(x) = \int_0^x f(t)\,dt$, and let A = g(1).

(a) Prove that g is odd and that g(x + 2) - g(x) = g(2).

(b) Compute g(2) and g(5) in terms of A.

(c) For what value of A will g be periodic with period 2?

21. Given two functions f and g, integrable on every interval and having the following properties:
f is odd, g is even, f(5) = 7, f(0) = 0, g(x) = f(x + 5), $f(x) = \int_0^x g(t)\,dt$ for all x. Prove
that (a) f(x - 5) = -g(x) for all x; (b) $\int_0^5 f(t)\,dt = 7$; (c) $\int_0^x f(t)\,dt = g(0) - g(x)$.


CONTINUOUS FUNCTIONS


3.1 Informal description of continuity 

This chapter deals with the concept of continuity, one of the most important and also
one of the most fascinating ideas in all of mathematics. Before we give a precise technical
definition of continuity, we shall briefly discuss the concept in an informal and intuitive
way to give the reader a feeling for its meaning.

Roughly speaking, the situation is this: Suppose a function f has the value f(p) at a
certain point p. Then f is said to be continuous at p if at every nearby point x the function
value f(x) is close to f(p). Another way of putting it is as follows: If we let x move toward
p, we want the corresponding function values f(x) to become arbitrarily close to f(p),
regardless of the manner in which x approaches p. We do not want sudden jumps in the
values of a continuous function, as in the examples in Figure 3.1.

Figure 3.1 Illustrating two kinds of discontinuities. (a) A jump discontinuity at each integer. (b) An infinite discontinuity at 0.

Figure 3.1(a) shows the graph of the function f defined by the equation f(x) = x - [x],
where [x] denotes the greatest integer $\le x$. At each integer we have what is known as a
jump discontinuity. For example, f(2) = 0, but as x approaches 2 from the left, f(x)
approaches the value 1, which is not equal to f(2). Therefore we have a discontinuity at 2.
Note that f(x) does approach f(2) if we let x approach 2 from the right, but this by itself
is not enough to establish continuity at 2. In a case like this, the function is called continuous
from the right at 2 and discontinuous from the left at 2. Continuity at a point requires both
continuity from the left and from the right.
In the early development of calculus almost all functions that were dealt with were
continuous and there was no real need at that time for a penetrating look into the exact 
meaning of continuity. It was not until late in the 18th Century that discontinuous functions 
began appearing in connection with various kinds of physical problems. In particular, the 
work of J. B. J. Fourier (1758-1830) on the theory of heat forced mathematicians of the 
early 19th Century to examine more carefully the exact meaning of such concepts as function 
and continuity. Although the meaning of the word “continuous” seems intuitively clear 
to most people, it is not obvious how a good definition of this idea should be formulated. 
One popular dictionary explains continuity as follows : 

Continuity: Quality or state of being continuous. 

Continuous: Having continuity of parts. 

Trying to learn the meaning of continuity from these two statements alone is like trying to 
learn Chinese with only a Chinese dictionary. A satisfactory mathematical definition of 
continuity, expressed entirely in terms of properties of the real-number system, was first 
formulated in 1821 by the French mathematician, Augustin-Louis Cauchy (1789-1857). 
His definition, which is still used today, is most easily explained in terms of the limit concept 
to which we turn now. 


3.2 The definition of the limit of a function 

Let f be a function defined in some open interval containing a point p, although we do
not insist that f be defined at the point p itself. Let A be a real number. The equation

$$\lim_{x \to p} f(x) = A$$

is read: “The limit of f(x), as x approaches p, is equal to A,” or “f(x) approaches A as x
approaches p.” It is also written without the limit symbol, as follows:

$$f(x) \to A \quad \text{as} \quad x \to p .$$

This symbolism is intended to convey the idea that we can make f(x) as close to A as we
please, provided we choose x sufficiently close to p.

Our first task is to explain the meaning of these symbols entirely in terms of real numbers.
We shall do this in two stages. First we introduce the concept of a neighborhood of a point,
then we define limits in terms of neighborhoods.

DEFINITION OF NEIGHBORHOOD OF A POINT. Any open interval containing a point p as
its midpoint is called a neighborhood of p.

Notation. We denote neighborhoods by $N(p)$, $N_1(p)$, $N_2(p)$, etc. Since a neighborhood
N(p) is an open interval symmetric about p, it consists of all real x satisfying p - r < x <
p + r for some r > 0. The positive number r is called the radius of the neighborhood. We
designate N(p) by N(p; r) if we wish to specify its radius. The inequalities p - r < x <
p + r are equivalent to -r < x - p < r, and to |x - p| < r. Thus, N(p; r) consists of
all points x whose distance from p is less than r.

In the next definition, we assume that A is a real number and that f is a function defined
on some neighborhood of a point p (except possibly at p). The function f may also be
defined at p but this is irrelevant in the definition.

DEFINITION OF LIMIT OF A FUNCTION. The symbolism

$$\lim_{x \to p} f(x) = A \qquad [\text{or } f(x) \to A \text{ as } x \to p]$$

means that for every neighborhood $N_1(A)$ there is some neighborhood $N_2(p)$ such that

(3.1)    $$f(x) \in N_1(A) \quad \text{whenever} \quad x \in N_2(p) \quad \text{and} \quad x \ne p .$$

The first thing to note about this definition is that it involves two neighborhoods, $N_1(A)$
and $N_2(p)$. The neighborhood $N_1(A)$ is specified first; it tells us how close we wish f(x) to
be to the limit A. The second neighborhood, $N_2(p)$, tells us how close x should be to p so
that f(x) will be within the first neighborhood $N_1(A)$. The essential part of the definition
is that, for every $N_1(A)$, no matter how small, there is some neighborhood $N_2(p)$ to satisfy
(3.1). In general, the neighborhood $N_2(p)$ will depend on the choice of $N_1(A)$. A neighborhood
$N_2(p)$ that works for one particular $N_1(A)$ will also work, of course, for every larger
$N_1(A)$, but it may not be suitable for any smaller $N_1(A)$.

Figure 3.2 Here $\lim_{x \to p} f(x) = A$, but there is no assertion about f at p.

Figure 3.3 Here f is defined at p and $\lim_{x \to p} f(x) = f(p)$, hence f is continuous at p.

The definition of limit may be illustrated geometrically as in Figure 3.2. A neighborhood
$N_1(A)$ is shown on the y-axis. A neighborhood $N_2(p)$ corresponding to $N_1(A)$ is shown on
the x-axis. The shaded rectangle consists of all points (x, y) for which $x \in N_2(p)$ and
$y \in N_1(A)$. The definition of limit asserts that the entire graph of f above the interval
$N_2(p)$ lies within this rectangle, except possibly for the point on the graph above p itself.

The definition of limit can also be formulated in terms of the radii of the neighborhoods
$N_1(A)$ and $N_2(p)$. It is customary to denote the radius of $N_1(A)$ by $\epsilon$ (the Greek
letter epsilon) and the radius of $N_2(p)$ by $\delta$ (the Greek letter delta). The statement
$f(x) \in N_1(A)$ is equivalent to the inequality $|f(x) - A| < \epsilon$, and the statement
$x \in N_2(p)$, $x \ne p$, is equivalent to the inequalities $0 < |x - p| < \delta$. Therefore,
the definition of limit can also be expressed as follows:

The symbol $\lim_{x \to p} f(x) = A$ means that for every $\epsilon > 0$, there is a $\delta > 0$ such that

(3.2) $|f(x) - A| < \epsilon$ whenever $0 < |x - p| < \delta$.
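The $\epsilon$, $\delta$ statement (3.2) can be checked numerically on a concrete case. The Python sketch below uses an illustrative function that is not taken from the text, f(x) = x*x with p = 2 and A = 4, and the helper names `delta_for` and `check` are hypothetical. Sampling finitely many points illustrates the definition; it does not prove the limit.

```python
def delta_for(eps: float) -> float:
    """Return a delta that works for f(x) = x*x at p = 2 (so A = 4).

    If |x - 2| < 1 then |x + 2| < 5, hence |x*x - 4| < 5 * |x - 2|.
    Taking delta = min(1, eps / 5) therefore forces |x*x - 4| < eps.
    """
    return min(1.0, eps / 5.0)


def check(eps: float, samples: int = 1000) -> bool:
    """Sample points with 0 < |x - 2| < delta and test |f(x) - 4| < eps."""
    p, A = 2.0, 4.0
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (p - offset, p + offset):
            if not abs(x * x - A) < eps:
                return False
    return True


for eps in (1.0, 0.1, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

Note that the choice of delta depends on eps, exactly as the neighborhood $N_2(p)$ depends on $N_1(A)$ in the definition.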

We note that the three statements,

$$\lim_{x \to p} f(x) = A, \qquad \lim_{x \to p} (f(x) - A) = 0, \qquad \lim_{x \to p} |f(x) - A| = 0,$$

are all equivalent. This equivalence becomes apparent as soon as we write each of these
statements in the $\epsilon$, $\delta$-terminology (3.2).

In dealing with limits as $x \to p$, we sometimes find it convenient to denote the difference
$x - p$ by a new symbol, say h, and to let $h \to 0$. This simply amounts to a change in
notation, because, as can be easily verified, the following two statements are equivalent:

$$\lim_{x \to p} f(x) = A, \qquad \lim_{h \to 0} f(p + h) = A.$$

EXAMPLE 1. Limit of a constant function. Let f(x) = c for all x. It is easy to prove
that for every p, we have $\lim_{x \to p} f(x) = c$. In fact, given any neighborhood $N_1(c)$,
relation (3.1) is trivially satisfied for any choice of $N_2(p)$ because f(x) = c for all x and
$c \in N_1(c)$ for all neighborhoods $N_1(c)$. In limit notation, we write

$$\lim_{x \to p} c = c.$$

EXAMPLE 2. Limit of the identity function. Here f(x) = x for all x. We can easily prove
that $\lim_{x \to p} f(x) = p$. Choose any neighborhood $N_1(p)$ and take $N_2(p) = N_1(p)$. Then
relation (3.1) is trivially satisfied. In limit notation, we write

$$\lim_{x \to p} x = p.$$

“One-sided” limits may be defined in a similar way. For example, if $f(x) \to A$ as $x \to p$
through values greater than p, we say that A is the right-hand limit of f at p, and we indicate
this by writing

$$\lim_{x \to p+} f(x) = A.$$

In neighborhood terminology this means that for every neighborhood $N_1(A)$, there is some
neighborhood $N_2(p)$ such that

(3.3) $f(x) \in N_1(A)$ whenever $x \in N_2(p)$ and $x > p$.

Left-hand limits, denoted by writing $x \to p-$, are similarly defined by restricting x to
values less than p.

If f has a limit A at p, then it also has a right-hand limit and a left-hand limit at p, both
of these being equal to A. But a function can have a right-hand limit at p different from the
left-hand limit, as indicated in the next example.

EXAMPLE 3. Let f(x) = [x] for all x, and let p be any integer. For x near p, x < p, we
have f(x) = p - 1, and for x near p, x > p, we have f(x) = p. Therefore we see that

$$\lim_{x \to p-} f(x) = p - 1 \qquad \text{and} \qquad \lim_{x \to p+} f(x) = p.$$

In an example like this one, where the right- and left-hand limits are unequal, the limit of
f at p does not exist.
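The behavior in Example 3 is easy to observe numerically. The following Python sketch evaluates the greatest-integer function at points approaching an integer p from each side; the helper name `one_sided_values` and the choice p = 3 are illustrative assumptions, and `math.floor` plays the role of [x].

```python
import math


def one_sided_values(f, p, side, steps=6):
    """Values of f at p + side * 10**(-k) for k = 1..steps (side is +1 or -1)."""
    return [f(p + side * 10.0 ** (-k)) for k in range(1, steps + 1)]


p = 3
left = one_sided_values(math.floor, p, -1)   # x approaches p through x < p
right = one_sided_values(math.floor, p, +1)  # x approaches p through x > p
print("from the left: ", left)
print("from the right:", right)
```

Every sampled value on the left equals p - 1 and every sampled value on the right equals p, which matches the two one-sided limits found above.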

EXAMPLE 4. Let $f(x) = 1/x^2$ if $x \ne 0$, and let f(0) = 0. The graph of f near zero is
shown in Figure 3.1(b). In this example, f takes arbitrarily large values near 0 so it has no
right-hand limit and no left-hand limit at 0. To prove rigorously that there is no real number
A such that $\lim_{x \to 0+} f(x) = A$, we may argue as follows: Suppose there were such an A,
say $A \ge 0$. Choose a neighborhood $N_1(A)$ of length 1. In the interval $0 < x < 1/(A + 2)$,
we have $f(x) = 1/x^2 > (A + 2)^2 > A + 2$, so f(x) cannot lie in the neighborhood $N_1(A)$.
Thus, every neighborhood N(0) contains points x > 0 for which f(x) is outside $N_1(A)$, so
(3.3) is violated for this choice of $N_1(A)$. Hence f has no right-hand limit at 0.

EXAMPLE 5. Let f(x) = 1 if $x \ne 0$, and let f(0) = 0. This function takes the constant
value 1 everywhere except at 0, where it has the value 0. Both the right- and left-hand
limits are 1 at every point p, so the limit of f(x), as x approaches p, exists and equals 1.
Note that the limit of f is 1 at the point 0, even though f(0) = 0.

3.3 The definition of continuity of a function 

In the definition of limit we made no assertion about the behavior of f at the point p
itself. Statement (3.1) refers to those $x \ne p$ which lie in $N_2(p)$, so it is not necessary that
f be defined at p. Moreover, even if f is defined at p, its value there need not be equal to
the limit A. However, if it happens that f is defined at p and if it also happens that f(p) = A,
then we say the function f is continuous at p. In other words, we have the following
definition.


DEFINITION OF CONTINUITY OF A FUNCTION AT A POINT. A function f is said to be
continuous at a point p if

(a) f is defined at p, and

(b) $\lim_{x \to p} f(x) = f(p)$.

This definition can also be formulated in terms of neighborhoods. A function f is
continuous at p if for every neighborhood $N_1[f(p)]$ there is a neighborhood $N_2(p)$ such that

(3.4) $f(x) \in N_1[f(p)]$ whenever $x \in N_2(p)$.

Since f(p) always belongs to $N_1[f(p)]$, we do not need the condition $x \ne p$ in (3.4). In
the $\epsilon$, $\delta$-terminology, where we specify the radii of the neighborhoods, the definition of
continuity can be restated as follows:

A function f is continuous at p if for every $\epsilon > 0$ there is a $\delta > 0$ such that

$|f(x) - f(p)| < \epsilon$ whenever $|x - p| < \delta$.

The definition of continuity is illustrated geometrically in Figure 3.3. This is like Figure
3.2 except that the limiting value, A, is equal to the function value f(p) so the entire graph
of f above $N_2(p)$ lies in the shaded rectangle.

EXAMPLE 1. Constant functions are continuous everywhere. If f(x) = c for all x, then

$$\lim_{x \to p} f(x) = \lim_{x \to p} c = c = f(p)$$

for every p, so f is continuous everywhere.

EXAMPLE 2. The identity function is continuous everywhere. If f(x) = x for all x, we have

$$\lim_{x \to p} f(x) = \lim_{x \to p} x = p = f(p)$$

for every p, so the identity function is continuous everywhere.

EXAMPLE 3. Let f(x) = [x] for all x. This function is continuous at every point p which
is not an integer. At the integers it is discontinuous, since the limit of f does not exist, the
right- and left-hand limits being unequal. A discontinuity of this type, where the right- and
left-hand limits exist but are unequal, is called a jump discontinuity. However, since the
right-hand limit equals f(p) at each integer p, we say that f is continuous from the right at p.

EXAMPLE 4. The function f for which $f(x) = 1/x^2$ for $x \ne 0$, f(0) = 0, is discontinuous
at 0. [See Figure 3.1(b).] We say there is an infinite discontinuity at 0 because the function
takes arbitrarily large values near 0.

EXAMPLE 5. Let f(x) = 1 for $x \ne 0$, f(0) = 0. This function is continuous everywhere
except at 0. It is discontinuous at 0 because f(0) is not equal to the limit of f(x) as $x \to 0$.
In this example, the discontinuity could be removed by redefining the function at 0 to have
the value 1 instead of 0. For this reason, a discontinuity of this type is called a removable
discontinuity. Note that jump discontinuities, such as those possessed by the greatest-integer
function, cannot be removed by simply changing the value of f at one point.


3.4 The basic limit theorems. More examples of continuous functions 

Calculations with limits may often be simplified by the use of the following theorem
which provides basic rules for operating with limits.

THEOREM 3.1. Let f and g be functions such that

$$\lim_{x \to p} f(x) = A, \qquad \lim_{x \to p} g(x) = B.$$

Then we have

(i) $\lim_{x \to p} [f(x) + g(x)] = A + B$,

(ii) $\lim_{x \to p} [f(x) - g(x)] = A - B$,

(iii) $\lim_{x \to p} f(x) \cdot g(x) = A \cdot B$,

(iv) $\lim_{x \to p} f(x)/g(x) = A/B$ if $B \ne 0$.

Note: An important special case of (iii) occurs when f is constant, say f(x) = A for
all x. In this case, (iii) is written as $\lim_{x \to p} A \cdot g(x) = A \cdot B$.

The proof of Theorem 3.1 is not difficult but it is somewhat lengthy so we have placed 
it in a separate section (Section 3.5). We discuss here some simple consequences of the 
theorem. 

First we note that the statements in the theorem may be written in a slightly different
form. For example, (i) can be written as follows:

$$\lim_{x \to p} [f(x) + g(x)] = \lim_{x \to p} f(x) + \lim_{x \to p} g(x).$$

It tells us that the limit of a sum is the sum of the limits.

It is customary to denote by $f + g$, $f - g$, $f \cdot g$, and $f/g$ the functions whose values at
each x under consideration are

$$f(x) + g(x), \quad f(x) - g(x), \quad f(x) \cdot g(x), \quad \text{and} \quad f(x)/g(x),$$

respectively. These functions are called the sum, difference, product, and quotient of f and
g. Of course, the quotient $f/g$ is defined only at those points for which $g(x) \ne 0$. The
following corollary to Theorem 3.1 is stated in this terminology and notation and is
concerned with continuous functions.


THEOREM 3.2. Let f and g be continuous at a point p. Then the sum $f + g$, the difference
$f - g$, and the product $f \cdot g$ are also continuous at p. The same is true of the quotient $f/g$ if
$g(p) \ne 0$.

Proof. Since f and g are continuous at p, we have $\lim_{x \to p} f(x) = f(p)$ and
$\lim_{x \to p} g(x) = g(p)$. Therefore we may apply the limit formulas in Theorem 3.1 with
A = f(p) and B = g(p) to deduce Theorem 3.2.






We have already seen that the identity function and constant functions are continuous 
everywhere. Using these examples and Theorem 3.2, we may construct many more examples 
of continuous functions. 


EXAMPLE 1. Continuity of polynomials. If we take f(x) = g(x) = x, the result on
continuity of products proves the continuity at each point for the function whose value at
each x is $x^2$. By mathematical induction, it follows that for every real c and every positive
integer n, the function f for which $f(x) = cx^n$ is continuous for all x. Since the sum of two
continuous functions is itself continuous, by induction it follows that the same is true for the
sum of any finite number of continuous functions. Therefore every polynomial
$p(x) = \sum_{k=0}^{n} c_k x^k$ is continuous at all points.

EXAMPLE 2. Continuity of rational functions. The quotient of two polynomials is called a
rational function. If r is a rational function, then we have

$$r(x) = \frac{p(x)}{q(x)},$$

where p and q are polynomials. The function r is defined for all real x for which $q(x) \ne 0$.
Since quotients of continuous functions are continuous, we see that every rational function
is continuous wherever it is defined. A simple example is $r(x) = 1/x$ if $x \ne 0$. This function
is continuous everywhere except at x = 0, where it fails to be defined.

The next theorem shows that if a function g is squeezed between two other functions
which have equal limits as $x \to p$, then g also has this limit as $x \to p$.

THEOREM 3.3. SQUEEZING PRINCIPLE. Suppose that $f(x) \le g(x) \le h(x)$ for all $x \ne p$
in some neighborhood $N(p)$. Suppose also that

$$\lim_{x \to p} f(x) = \lim_{x \to p} h(x) = a.$$

Then we also have $\lim_{x \to p} g(x) = a$.

Proof. Let $G(x) = g(x) - f(x)$, and $H(x) = h(x) - f(x)$. The inequalities $f \le g \le h$
imply $0 \le g - f \le h - f$, or

$$0 \le G(x) \le H(x)$$

for all $x \ne p$ in $N(p)$. To prove the theorem, it suffices to show that $G(x) \to 0$ as $x \to p$,
given that $H(x) \to 0$ as $x \to p$.

Let $N_1(0)$ be any neighborhood of 0. Since $H(x) \to 0$ as $x \to p$, there is a neighborhood
$N_2(p)$ such that

$H(x) \in N_1(0)$ whenever $x \in N_2(p)$ and $x \ne p$.

We can assume that $N_2(p) \subseteq N(p)$. Then the inequality $0 \le G \le H$ states that G(x) is no
further from 0 than H(x) if x is in $N_2(p)$, $x \ne p$. Therefore $G(x) \in N_1(0)$ for such x, and
hence $G(x) \to 0$ as $x \to p$. This proves the theorem. The same proof is valid if all the
limits are one-sided limits.

The squeezing principle is useful in practice because it is often possible to find squeezing 
functions f and h which are easier to deal with than g. We shall use the result now to prove 
that every indefinite integral is a continuous function. 

THEOREM 3.4. CONTINUITY OF INDEFINITE INTEGRALS. Assume f is integrable on [a, x]
for every x in [a, b], and let

$$A(x) = \int_a^x f(t)\,dt.$$

Then the indefinite integral A is continuous at each point of [a, b]. (At each endpoint we have
one-sided continuity.)

Proof. Choose p in [a, b]. We are to prove that $A(x) \to A(p)$ as $x \to p$. We have

(3.5) $A(x) - A(p) = \int_p^x f(t)\,dt$.

Now we estimate the size of this integral. Since f is bounded on [a, b], there is a constant
M > 0 such that $-M \le f(t) \le M$ for all t in [a, b]. If x > p, we integrate these inequalities
over the interval [p, x] to obtain

$$-M(x - p) \le A(x) - A(p) \le M(x - p).$$

If x < p, we obtain the same inequalities with $x - p$ replaced by $p - x$. Therefore, in
either case we can let $x \to p$ and apply the squeezing principle to find that $A(x) \to A(p)$.
This proves the theorem. If p is an endpoint of [a, b], we must let $x \to p$ from inside the
interval, so the limits are one-sided.
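The estimate used in this proof can be observed on a discontinuous integrand. The Python sketch below takes the greatest-integer function on [0, 3] as an illustrative choice (it is not the example used in the text) and uses a closed form for its indefinite integral that was worked out by hand for this sketch; the names `f`, `A`, and `M` mirror the symbols in the proof.

```python
import math


def f(t: float) -> int:
    """Integrand: the greatest-integer function, which jumps at each integer."""
    return math.floor(t)


def A(x: float) -> float:
    """A(x) = integral of f from 0 to x, for 0 <= x <= 3, in closed form.

    On [k, k+1) the integrand equals k, so the integral is the sum of the
    full unit steps 0 + 1 + ... + (n-1) plus the partial step n*(x - n),
    where n = floor(x).
    """
    n = math.floor(x)
    return n * (n - 1) / 2 + n * (x - n)


M = 3      # |f(t)| <= 3 for every t in [0, 3]
p = 2.0    # f has a jump at p, yet A turns out to be continuous there
for k in range(1, 7):
    h = 10.0 ** (-k)
    for x in (p - h, p + h):
        gap = abs(A(x) - A(p))
        # The bound from the proof: |A(x) - A(p)| <= M * |x - p|
        # (a tiny tolerance absorbs floating-point rounding).
        assert gap <= M * abs(x - p) + 1e-12
        print(f"x = {x:.6f}   f(x) = {f(x)}   |A(x) - A(p)| = {gap:.2e}")
```

The printed values of f(x) differ on the two sides of p, while the gap |A(x) - A(p)| shrinks in proportion to |x - p|.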

EXAMPLE 3. Continuity of the sine and cosine. Since the sine function is an indefinite
integral, $\sin x = \int_0^x \cos t\,dt$, the foregoing theorem tells us that the sine is continuous
everywhere. Similarly, the cosine is everywhere continuous since
$\cos x = 1 - \int_0^x \sin t\,dt$. The continuity of these functions can also be deduced without
making use of the fact that they are indefinite integrals. An alternate proof is outlined in
Exercise 26 of Section 3.6.

EXAMPLE 4. In this example we prove an important limit formula,

(3.6) $\lim_{x \to 0} \dfrac{\sin x}{x} = 1$,

that is needed later in our discussion of differential calculus. Since the denominator of the
quotient $(\sin x)/x$ approaches 0 as $x \to 0$, we cannot apply the quotient theorem on limits
to deduce (3.6). Instead, we use the squeezing principle. From Section 2.5 we have the
fundamental inequalities

$$0 < \cos x < \frac{\sin x}{x} < \frac{1}{\cos x},$$

valid for $0 < x < \tfrac{1}{2}\pi$. They are also valid for $-\tfrac{1}{2}\pi < x < 0$ since
$\cos(-x) = \cos x$ and $\sin(-x) = -\sin x$, and hence they hold for all $x \ne 0$ in the
neighborhood $N(0; \tfrac{1}{2}\pi)$. When $x \to 0$, we find $\cos x \to 1$ since the cosine is
continuous at 0, and hence $1/(\cos x) \to 1$. Therefore, by the squeezing principle, we deduce
(3.6). If we define $f(x) = (\sin x)/x$ for $x \ne 0$, f(0) = 1, then f is continuous everywhere.
Its graph is shown in Figure 3.4.

Figure 3.4 $f(x) = (\sin x)/x$ if $x \ne 0$, f(0) = 1. This function is continuous everywhere.
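The squeezing argument in Example 4 can be watched numerically. The Python sketch below checks the inequalities $0 < \cos x < (\sin x)/x < 1/\cos x$ at a few sample points on both sides of 0 and prints the three quantities; the function name `squeeze_holds` and the sample points are illustrative assumptions, and floating-point sampling is only a sanity check, not a proof.

```python
import math


def squeeze_holds(x: float) -> bool:
    """Check 0 < cos x < sin(x)/x < 1/cos x at a single nonzero x."""
    c = math.cos(x)
    q = math.sin(x) / x
    return 0 < c < q < 1 / c


for k in range(1, 6):
    for sign in (+1, -1):
        x = sign * 10.0 ** (-k)
        lower, middle, upper = math.cos(x), math.sin(x) / x, 1 / math.cos(x)
        print(f"x = {x:+.0e}  {lower:.12f} < {middle:.12f} < {upper:.12f}",
              squeeze_holds(x))
```

As x shrinks, the lower and upper bounds both move toward 1, and the quotient trapped between them has nowhere else to go.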


EXAMPLE 5. Continuity of f when $f(x) = x^r$ for x > 0, where r is a positive rational number.
From Theorem 2.2 we have the integration formula

$$\int_0^x t^{1/n}\,dt = \frac{x^{1+1/n}}{1 + 1/n},$$

valid for all x > 0 and every integer $n \ge 1$. Using Theorems 3.4 and 3.1, we find that the
function A given by $A(x) = x^{1+1/n}$ is continuous at all points p > 0. Now let
$g(x) = x^{1/n} = A(x)/x$ for x > 0. Since g is a quotient of two continuous functions it, too, is
continuous at all points p > 0. More generally, if $f(x) = x^{m/n}$, where m is a positive
integer, then f is a product of continuous functions and hence is continuous at all points
p > 0. This establishes the continuity of the rth-power function, $f(x) = x^r$, when r is any
positive rational number, at all points p > 0. At p = 0 we have right-hand continuity.

The continuity of the rth-power function for rational r can also be deduced without
using integrals. An alternate proof is given in Section 3.13.


3.5 Proofs of the basic limit theorems 

In this section we prove Theorem 3.1 which describes the basic rules for dealing with
limits of sums, products, and quotients. The principal algebraic tools used in the proof
are two properties of absolute values that were mentioned earlier in Sections 14.8 and 14.9.
They are (1) the triangle inequality, which states that $|a + b| \le |a| + |b|$ for all real a and
b, and (2) the equation $|ab| = |a|\,|b|$, which states that the absolute value of a product is
the product of absolute values.

Proofs of (i) and (ii). Since the two statements

$$\lim_{x \to p} f(x) = A \qquad \text{and} \qquad \lim_{x \to p} [f(x) - A] = 0$$

are equivalent, and since we have

$$f(x) + g(x) - (A + B) = [f(x) - A] + [g(x) - B],$$

it suffices to prove part (i) of the theorem when the limits A and B are both zero.

Suppose, then, that $f(x) \to 0$ and $g(x) \to 0$ as $x \to p$. We shall prove that
$f(x) + g(x) \to 0$ as $x \to p$. This means we must show that for every $\epsilon > 0$ there is a
$\delta > 0$ such that

(3.7) $|f(x) + g(x)| < \epsilon$ whenever $0 < |x - p| < \delta$.

Let $\epsilon$ be given. Since $f(x) \to 0$ as $x \to p$, there is a $\delta_1 > 0$ such that

(3.8) $|f(x)| < \dfrac{\epsilon}{2}$ whenever $0 < |x - p| < \delta_1$.

Similarly, since $g(x) \to 0$ as $x \to p$, there is a $\delta_2 > 0$ such that

(3.9) $|g(x)| < \dfrac{\epsilon}{2}$ whenever $0 < |x - p| < \delta_2$.

If we let $\delta$ denote the smaller of the two numbers $\delta_1$ and $\delta_2$, then both inequalities
(3.8) and (3.9) are valid if $0 < |x - p| < \delta$ and hence, by the triangle inequality, we find that

$$|f(x) + g(x)| \le |f(x)| + |g(x)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

This proves (3.7) which, in turn, proves (i). The proof of (ii) is entirely similar, except that
in the last step we use the inequality $|f(x) - g(x)| \le |f(x)| + |g(x)|$.

Proof of (iii). Suppose that we have proved part (iii) for the special case in which one
of the limits is 0. Then the general case follows easily from this special case. In fact, all
we need to do is write

$$f(x)g(x) - AB = f(x)[g(x) - B] + B[f(x) - A].$$

The special case implies that each term on the right approaches 0 as $x \to p$ and, by property
(i), the sum of the two terms also approaches 0. Therefore, it remains to prove (iii) in the
special case where one of the limits, say B, is 0.

Suppose, then, that $f(x) \to A$ and $g(x) \to 0$ as $x \to p$. We wish to prove that
$f(x)g(x) \to 0$ as $x \to p$. To do this we must show that if a positive $\epsilon$ is given, there is
a $\delta > 0$ such that

(3.10) $|f(x)g(x)| < \epsilon$ whenever $0 < |x - p| < \delta$.

Since $f(x) \to A$ as $x \to p$, there is a $\delta_1 > 0$ such that

(3.11) $|f(x) - A| < 1$ whenever $0 < |x - p| < \delta_1$.

For such x, we have $|f(x)| = |f(x) - A + A| \le |f(x) - A| + |A| < 1 + |A|$, and hence

(3.12) $|f(x)g(x)| = |f(x)|\,|g(x)| \le (1 + |A|)\,|g(x)|$.

Since $g(x) \to 0$ as $x \to p$, for every $\epsilon > 0$ there is a $\delta_2 > 0$ such that

(3.13) $|g(x)| < \dfrac{\epsilon}{1 + |A|}$ whenever $0 < |x - p| < \delta_2$.

Therefore, if we let $\delta$ be the smaller of the two numbers $\delta_1$ and $\delta_2$, then both inequalities
(3.12) and (3.13) are valid whenever $0 < |x - p| < \delta$, and for such x we deduce (3.10).
This completes the proof of (iii).

Proof of (iv). Since the quotient $f(x)/g(x)$ is the product of $f(x)/B$ with $B/g(x)$, it suffices
to prove that $B/g(x) \to 1$ as $x \to p$ and then appeal to (iii). Let $h(x) = g(x)/B$. Then
$h(x) \to 1$ as $x \to p$, and we wish to prove that $1/h(x) \to 1$ as $x \to p$.
Let $\epsilon > 0$ be given. We must show that there is a $\delta > 0$ such that

(3.14) $\left| \dfrac{1}{h(x)} - 1 \right| < \epsilon$ whenever $0 < |x - p| < \delta$.

The difference to be estimated may be written as follows.

(3.15) $\left| \dfrac{1}{h(x)} - 1 \right| = \dfrac{|h(x) - 1|}{|h(x)|}$.

Since $h(x) \to 1$ as $x \to p$, we can choose a $\delta > 0$ such that both inequalities

(3.16) $|h(x) - 1| < \dfrac{\epsilon}{2}$ and $|h(x) - 1| < \dfrac{1}{2}$

are satisfied whenever $0 < |x - p| < \delta$. The second of these inequalities implies
$h(x) > \tfrac{1}{2}$, so $1/|h(x)| = 1/h(x) < 2$ for such x. Using this in (3.15) along with the
first inequality in (3.16), we obtain (3.14). This completes the proof of (iv).

3.6 Exercises

In Exercises 1 through 10, compute the limits and explain which limit theorems you are using
in each case.


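1. $\lim_{x \to 2} \dfrac{1}{x^2}$.

2. $\lim_{x \to 0} \dfrac{25x^3 + 2}{75x^7 - 2}$.

3. $\lim_{x \to 2} \dfrac{x^2 - 4}{x - 2}$.

4. $\lim_{x \to 1} \dfrac{2x^2 - 3x + 1}{x - 1}$.

5. $\lim_{h \to 0} \dfrac{(t + h)^2 - t^2}{h}$.

6. $\lim_{x \to 0} \dfrac{x^2 - a^2}{x^2 + 2ax + a^2}$, $a \ne 0$.

7. $\lim_{a \to 0} \dfrac{x^2 - a^2}{x^2 + 2ax + a^2}$, $x \ne 0$.

8. $\lim_{x \to a} \dfrac{x^2 - a^2}{x^2 + 2ax + a^2}$, $a \ne 0$.

9. $\lim_{t \to 0} \tan t$.

10. $\lim_{t \to 0} (\sin 2t + t^2 \cos 5t)$.

11. $\lim_{x \to 0+} \dfrac{|x|}{x}$.

12. $\lim_{x \to 0-} \dfrac{|x|}{x}$.

13. $\lim_{x \to 0+} \dfrac{\sqrt{x^2}}{x}$.

14. $\lim_{x \to 0-} \dfrac{\sqrt{x^2}}{x}$.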

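Use the relation $\lim_{x \to 0} (\sin x)/x = 1$ to establish the limit formulas in Exercises 15
through 20.

15. $\lim_{x \to 0} \dfrac{\sin 2x}{x} = 2$.

16. $\lim_{x \to 0} \dfrac{\tan 2x}{\sin x} = 2$.

17. $\lim_{x \to 0} \dfrac{\sin 5x}{\sin x} = 5$.

18. $\lim_{x \to 0} \dfrac{\sin 5x - \sin 3x}{x} = 2$.

19. $\lim_{x \to a} \dfrac{\sin x - \sin a}{x - a} = \cos a$.

20. $\lim_{x \to 0} \dfrac{1 - \cos x}{x^2} = \dfrac{1}{2}$.

21. Show that $\lim_{x \to 0} \dfrac{1 - \sqrt{1 - x^2}}{x^2} = \dfrac{1}{2}$.
[Hint: $(1 - \sqrt{u})(1 + \sqrt{u}) = 1 - u$.]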

22. A function f is defined as follows:

$$f(x) = \begin{cases} \sin x & \text{if } x \le c, \\ ax + b & \text{if } x > c, \end{cases}$$

where a, b, c are constants. If b and c are given, find all values of a (if any exist) for which f
is continuous at the point x = c.

23. Solve Exercise 22 if f is defined as follows:

$$f(x) = \begin{cases} 2 \cos x & \text{if } x \le c, \\ ax^2 + b & \text{if } x > c. \end{cases}$$

24. At what points are the tangent and cotangent functions continuous?

25. Let $f(x) = (\tan x)/x$ if $x \ne 0$. Sketch the graph of f over the half-open intervals
$[-\tfrac{1}{4}\pi, 0)$ and $(0, \tfrac{1}{4}\pi]$. What happens to f(x) as $x \to 0$? Can you define f(0)
so that f becomes continuous at 0?

26. This exercise outlines an alternate proof of the continuity of the sine and cosine functions.

(a) The inequality $|\sin x| < |x|$, valid for $0 < |x| < \tfrac{1}{2}\pi$, was proved in Exercise 34 of
Section 2.8. Use this inequality to prove that the sine function is continuous at 0.

(b) Use part (a) and the identity $\cos 2x = 1 - 2 \sin^2 x$ to prove that the cosine is continuous
at 0.

(c) Use the addition formulas for $\sin(x + h)$ and $\cos(x + h)$ to prove that the sine and cosine
are continuous at any real x.

27. Figure 3.5 shows a portion of the graph of the function f defined as follows:

$$f(x) = \sin \frac{1}{x} \qquad \text{if } x \ne 0.$$

For $x = 1/(n\pi)$, where n is an integer, we have $\sin(1/x) = \sin(n\pi) = 0$. Between two such
points, the function values rise to +1 and drop back to 0 or else drop to -1 and rise back to 0.
Therefore, between any such point and the origin, the curve has an infinite number of
oscillations. This suggests that the function values do not approach any fixed value as
$x \to 0$. Prove that there is no real number A such that $f(x) \to A$ as $x \to 0$. This shows that
it is not possible to define f(0) in such a way that f becomes continuous at 0.

[Hint: Assume such an A exists and obtain a contradiction.]

Figure 3.5 $f(x) = \sin(1/x)$ if $x \ne 0$. This function is discontinuous at 0 no matter
how f(0) is defined.

28. For $x \ne 0$, let $f(x) = [1/x]$, where [t] denotes the greatest integer $\le t$. Sketch the graph of
f over the intervals $[-2, -\tfrac{1}{5}]$ and $[\tfrac{1}{5}, 2]$. What happens to f(x) as $x \to 0$ through
positive values? through negative values? Can you define f(0) so that f becomes continuous at 0?

29. Same as Exercise 28, when $f(x) = (-1)^{[1/x]}$ for $x \ne 0$.

30. Same as Exercise 28, when $f(x) = x(-1)^{[1/x]}$ for $x \ne 0$.

31. Give an example of a function that is continuous at one point of an interval and discontinuous 
at all other points of the interval, or prove that there is no such function. 

32. Let $f(x) = x \sin(1/x)$ if $x \ne 0$. Define f(0) so that f will be continuous at 0.

33. Let f be a function such that $|f(u) - f(v)| \le |u - v|$ for all u and v in an interval [a, b].

(a) Prove that f is continuous at each point of [a, b].

(b) Assume that f is integrable on [a, b]. Prove that

$$\left| \int_a^b f(x)\,dx - (b - a)f(a) \right| \le \frac{(b - a)^2}{2}.$$

(c) More generally, prove that for any c in [a, b], we have

$$\left| \int_a^b f(x)\,dx - (b - a)f(c) \right| \le \frac{(b - a)^2}{2}.$$

3.7 Composite functions and continuity 

We can create new functions from given ones by addition, subtraction, multiplication, 
and division. In this section we learn a new way to construct functions by an operation 
known as composition. We illustrate with an example. 

Let $f(x) = \sin(x^2)$. To compute f(x), we first square x and then take the sine of $x^2$.
Thus, f(x) is obtained by combining two other functions, the squaring function and the
sine function. If we let $v(x) = x^2$ and $u(x) = \sin x$, we can express f(x) in terms of u and v
by writing

$$f(x) = u[v(x)].$$

We say that f is the composition of u and v (in that order). If we compose v and u in the
opposite order, we obtain a different result, $v[u(x)] = (\sin x)^2$. That is, to compute v[u(x)],
we take the sine of x first and then square $\sin x$.

Now we can carry out this process more generally. Let u and v be any two given functions.
The composite or composition of u and v (in that order) is defined to be the function f for
which

$$f(x) = u[v(x)] \qquad \text{(read as “u of v of x”)}.$$

That is, to evaluate f at x we first compute v(x) and then evaluate u at the point v(x). Of
course, this presupposes that it makes sense to evaluate u at v(x), and therefore f will be
defined only at those points x for which v(x) is in the domain of u.

For example, if $u(x) = \sqrt{x}$ and $v(x) = 1 - x^2$, then the composite f is given by
$f(x) = \sqrt{1 - x^2}$. Note that v is defined for all real x, whereas u is defined only for
$x \ge 0$. Therefore the composite f is defined only for those x satisfying $1 - x^2 \ge 0$.

Formally, f(x) is obtained by substituting v(x) for x in the expression u(x). For this
reason, the function f is sometimes denoted by the symbol f = u(v) (read as “u of v”).
Another notation that we shall use to denote composition is $f = u \circ v$ (read as “u circle
v”). This resembles the notation for the product $u \cdot v$. In fact, we shall see in a moment
that the operation of composition has some of the properties possessed by multiplication.

The composite of three or more functions may be found by composing them two at a
time. Thus, the function f given by

$$f(x) = \cos[\sin(x^2)]$$

is a composition, $f = u \circ (v \circ w)$, where

$$u(x) = \cos x, \qquad v(x) = \sin x, \qquad \text{and} \qquad w(x) = x^2.$$

Notice that the same f can be obtained by composing u and v first and then composing $u \circ v$
with w, thus: $f = (u \circ v) \circ w$. This illustrates the associative law for composition which
states that

(3.17) $u \circ (v \circ w) = (u \circ v) \circ w$

for all functions u, v, w, provided it makes sense to form all the composites in question.
The reader will find that the proof of (3.17) is a straightforward exercise.

It should be noted that the commutative law, $u \circ v = v \circ u$, does not always hold for
composition. For example, if $u(x) = \sin x$ and $v(x) = x^2$, the composite $f = u \circ v$ is given
by $f(x) = \sin x^2$ [which means $\sin(x^2)$], whereas the composition $g = v \circ u$ is given by
$g(x) = \sin^2 x$ [which means $(\sin x)^2$].
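Composition, the associative law (3.17), and the failure of the commutative law can all be tried out directly. In the Python sketch below, `compose` is a hypothetical helper written for this illustration, and `u`, `v`, `w` are the cosine, sine, and squaring functions used above; comparing values at a few sample points only illustrates the laws and does not prove them.

```python
import math


def compose(outer, inner):
    """Return the composite function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def u(x):
    return math.cos(x)


def v(x):
    return math.sin(x)


def w(x):
    return x * x


# Associativity: u o (v o w) and (u o v) o w both compute cos(sin(x*x)).
left = compose(u, compose(v, w))
right = compose(compose(u, v), w)
for x in (0.0, 0.7, 2.5):
    print(x, left(x), right(x), left(x) == right(x))

# Commutativity fails: sin(x*x) versus (sin x)*(sin x).
f = compose(v, w)   # f(x) = sin(x^2)
g = compose(w, v)   # g(x) = (sin x)^2
print(f(1.0), g(1.0))
```

Both groupings perform the same operations in the same order, which is why `left` and `right` agree exactly, whereas swapping the order of `v` and `w` changes the function itself.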

Now we shall prove a theorem which tells us that the property of continuity is preserved 
under the operation of composition. More precisely, we have the following. 

THEOREM 3.5. Assume v is continuous at p and that u is continuous at q, where q = v(p).
Then the composite function $f = u \circ v$ is continuous at p.

Proof. Since u is continuous at q, for every neighborhood $N_1[u(q)]$ there is a neighborhood
$N_2(q)$ such that

(3.18) $u(y) \in N_1[u(q)]$ whenever $y \in N_2(q)$.

But q = v(p) and v is continuous at p, so for the neighborhood $N_2(q)$ there is another
neighborhood $N_3(p)$ such that

(3.19) $v(x) \in N_2(q)$ whenever $x \in N_3(p)$.

If we let y = v(x) and combine (3.18) with (3.19), we find that for every neighborhood
$N_1(u[v(p)])$ there is a neighborhood $N_3(p)$ such that

$u[v(x)] \in N_1(u[v(p)])$ whenever $x \in N_3(p)$,

or, in other words, since f(x) = u[v(x)],

$f(x) \in N_1[f(p)]$ whenever $x \in N_3(p)$.

This means that f is continuous at p, as asserted.

EXAMPLE 1. Let $f(x) = \sin x^2$. This is the composition of two functions continuous
everywhere so f is continuous everywhere.

EXAMPLE 2. Let $f(x) = \sqrt{1 - x^2} = u[v(x)]$, where $u(x) = \sqrt{x}$, $v(x) = 1 - x^2$. The
function v is continuous everywhere but u is continuous only for points $x \ge 0$. Hence f is
continuous at those points x for which $v(x) \ge 0$, that is, at all points satisfying $x^2 \le 1$.

3.8 Exercises

In Exercises 1 through 10, the functions f and g are defined by the formulas given. Unless
otherwise noted, the domains of f and g consist of all real numbers. Let h(x) = f[g(x)] whenever
g(x) lies in the domain of f. In each case, describe the domain of h and give one or more
formulas for determining h(x).


1. $f(x) = x^2 - 2x$, $g(x) = x + 1$.

2. $f(x) = x + 1$, $g(x) = x^2 - 2x$.

3. $f(x) = \sqrt{x}$ if $x \ge 0$, $g(x) = x^2$.

4. $f(x) = \sqrt{x}$ if $x \ge 0$, $g(x) = -x^2$.

5. $f(x) = x^2$, $g(x) = \sqrt{x}$ if $x \ge 0$.

6. $f(x) = -x^2$, $g(x) = \sqrt{x}$ if $x \ge 0$.

7. $f(x) = \sin x$, $g(x) = \sqrt{x}$ if $x \ge 0$.

8. $f(x) = \sqrt{x}$ if $x \ge 0$, $g(x) = \sin x$.

9. $f(x) = \sqrt{x}$ if $x > 0$, $g(x) = x + \sqrt{x}$ if $x > 0$.

10. $f(x) = \sqrt{x + \sqrt{x}}$ if $x > 0$, $g(x) = x + \sqrt{x}$ if $x > 0$.

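Calculate the limits in Exercises 11 through 20 and explain which limit theorems you are using
in each case.

11. $\lim_{x \to -2} \dfrac{x^3 + 8}{x^2 - 4}$.

12. $\lim_{x \to 4} \sqrt{1 + \sqrt{x}}$.

13. $\lim_{t \to 0} \dfrac{\sin(\tan t)}{\sin t}$.

14. $\lim_{x \to \pi/2} \dfrac{\sin(\cos x)}{\cos x}$.

15. $\lim_{t \to \pi} \dfrac{\sin(t - \pi)}{t - \pi}$.

16. $\lim_{x \to 1} \dfrac{\sin(x^2 - 1)}{x - 1}$.

17. $\lim_{x \to 0} x \sin \dfrac{1}{x}$.

18. $\lim_{x \to 0} \dfrac{1 - \cos 2x}{x^2}$.

19. $\lim_{x \to 0} \dfrac{\sqrt{1 + x} - \sqrt{1 - x}}{x}$.

20. $\lim_{x \to 0} \dfrac{1 - \sqrt{1 - 4x^2}}{x^2}$.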


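21. Let f and g be two functions defined as follows:

$$f(x) = \frac{x + |x|}{2} \quad \text{for all } x, \qquad g(x) = \begin{cases} x & \text{for } x < 0, \\ x^2 & \text{for } x \ge 0. \end{cases}$$

Find a formula (or formulas) for computing the composite function h(x) = f[g(x)]. For
what values of x is h continuous?

22. Solve Exercise 21 when f and g are defined as follows:

$$f(x) = \begin{cases} 1 & \text{if } |x| \le 1, \\ 0 & \text{if } |x| > 1, \end{cases} \qquad g(x) = \begin{cases} 2 - x^2 & \text{if } |x| \le 2, \\ 2 & \text{if } |x| > 2. \end{cases}$$

23. Solve Exercise 21 when h(x) = g[f(x)].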


3.9 Bolzano’s theorem for continuous functions 

In the rest of this chapter we shall discuss certain special properties of continuous
functions that are used quite frequently. Most of these properties appear obvious when
interpreted geometrically; consequently many people are inclined to accept them as self-evident.
However, it is important to realize that these statements are no more self-evident than the
definition of continuity itself, and therefore they require proof if they are to be used with
any degree of generality. The proofs of most of these properties make use of the
least-upper-bound axiom for the real number system.

Bernard Bolzano (1781-1848), a Catholic priest who made many important contributions
to mathematics in the first half of the 19th century, was one of the first to recognize that
many “obvious” statements about continuous functions require proof. His observations
concerning continuity were published posthumously in 1850 in an important book,
Paradoxien des Unendlichen. One of his results, now known as the theorem of Bolzano, is
illustrated in Figure 3.6, where the graph of a continuous function f is shown. The graph
lies below the x-axis at x = a and above the axis at x = b. Bolzano’s theorem asserts that
the curve must cross the axis somewhere between a and b. This property, first published
by Bolzano in 1817, may be stated formally as follows.

THEOREM 3.6. BOLZANO’S THEOREM. Let f be continuous at each point of a closed interval
[a, b] and assume that f(a) and f(b) have opposite signs. Then there is at least one c in the
open interval (a, b) such that f(c) = 0.
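Bolzano’s theorem guarantees that a zero exists; a common way to locate one numerically is repeated bisection, sketched below in Python. This is a different technique from the supremum argument used in the proof in this section, and the function name `bisect_root`, the tolerance, and the sample cubic are all illustrative assumptions rather than anything taken from the text.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a zero of f in [a, b] by repeated halving.

    Assumes f is continuous on [a, b] and that f(a), f(b) have opposite
    signs, which are the hypotheses of Bolzano's theorem.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa > 0) == (fb > 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fm == 0:
            return m
        if (fm > 0) == (fa > 0):
            a, fa = m, fm   # sign change lies in [m, b]
        else:
            b = m           # sign change lies in [a, m]
    return (a + b) / 2


def cubic(x):
    """Sample function: negative at 1, positive at 2."""
    return x ** 3 - x - 1


root = bisect_root(cubic, 1.0, 2.0)
print(root, cubic(root))
```

At every step the function values at the two endpoints keep opposite signs, so the theorem applies to each smaller interval, and the interval length is halved until it falls below the tolerance.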

We shall base our proof of Bolzano’s theorem on the following property of continuous 
functions which we state here as a separate theorem. 

THEOREM 3.7. SIGN-PRESERVING PROPERTY OF CONTINUOUS FUNCTIONS. Let f be
continuous at c and suppose that $f(c) \ne 0$. Then there is an interval $(c - \delta, c + \delta)$ about c in
which f has the same sign as f(c).

Proof of Theorem 3.7. Suppose f(c) > 0. By continuity, for every ε > 0 there is a
δ > 0 such that

(3.20) f(c) − ε < f(x) < f(c) + ε whenever c − δ < x < c + δ.

If we take the δ corresponding to ε = f(c)/2 (this ε is positive), then (3.20) becomes

(1/2) f(c) < f(x) < (3/2) f(c) whenever c − δ < x < c + δ.

(See Figure 3.7.) Therefore f(x) > 0 in this interval, and hence f(x) and f(c) have the
same sign. If f(c) < 0, we take the δ corresponding to ε = −f(c)/2 and arrive at the same
conclusion.

Figure 3.7 Here f(x) > 0 for x near c because f(c) > 0.

Note: If there is one-sided continuity at c, then there is a corresponding one-sided
interval [c, c + δ) or (c − δ, c] in which f has the same sign as f(c).

Proof of Bolzano’s theorem. To be specific, assume f(a) < 0 and f(b) > 0, as shown
in Figure 3.6. There may be many values of x between a and b for which f(x) = 0. Our
problem is to find one. We shall do this by finding the largest x for which f(x) = 0. For
this purpose we let S denote the set of all those points x in the interval [a, b] for which
f(x) ≤ 0. There is at least one point in S because f(a) < 0. Therefore S is a nonempty
set. Also, S is bounded above since all of S lies within [a, b], so S has a supremum. Let
c = sup S. We shall prove that f(c) = 0.

There are only three possibilities: f(c) > 0, f(c) < 0, and f(c) = 0. If f(c) > 0, there
is an interval (c − δ, c + δ), or (c − δ, c] if c = b, in which f is positive. Therefore no
points of S can lie to the right of c − δ, and hence c − δ is an upper bound for the set S.
But c − δ < c, and c is the least upper bound of S. Therefore the inequality f(c) > 0
is impossible. If f(c) < 0, there is an interval (c − δ, c + δ), or [c, c + δ) if c = a, in
which f is negative. Hence f(x) < 0 for some x > c, contradicting the fact that c is an
upper bound for S. Therefore f(c) < 0 is also impossible, and the only remaining possibility
is f(c) = 0. Also, a < c < b because f(a) < 0 and f(b) > 0. This proves Bolzano’s
theorem.
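
Bolzano’s theorem only asserts that a zero exists. As a numerical illustration, the following Python sketch locates one by repeated bisection. Note that this is a different technique from the supremum argument used in the proof above, and it need not return the largest zero. The function name bisect_zero, the tolerance, and the sample cubic are illustrative assumptions, not part of the text.

```python
def bisect_zero(f, a, b, tol=1e-12):
    """Approximate a zero of f in [a, b] by repeated bisection.

    Assumes f is continuous on [a, b] and that f(a), f(b) have opposite
    signs, which are the hypotheses of Bolzano's theorem.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa < 0) == (fb < 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c
        # Keep the half-interval on which the sign change persists.
        if (fc < 0) == (fa < 0):
            a, fa = c, fc
        else:
            b, fb = c, fc
    return (a + b) / 2


# Hypothetical example: f(x) = x^3 - 2x - 5 is negative at 2 and positive at 3.
root = bisect_zero(lambda x: x**3 - 2 * x - 5, 2.0, 3.0)
print(root)
```

Each step halves an interval on which f changes sign, so the theorem guarantees that a zero always remains inside the interval being kept.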

3.10 The intermediate-value theorem for continuous functions 

An immediate consequence of Bolzano’s theorem is the intermediate-value theorem for 
continuous functions, illustrated in Figure 3.8. 

THEOREM 3.8. Let f be continuous at each point of a closed interval [a, b]. Choose two
arbitrary points x_1 < x_2 in [a, b] such that f(x_1) ≠ f(x_2). Then f takes on every value between
f(x_1) and f(x_2) somewhere in the interval (x_1, x_2).

Proof. Suppose f(x_1) < f(x_2) and let k be any value between f(x_1) and f(x_2). Let g be the
function defined on [x_1, x_2] as follows:

g(x) = f(x) − k.

Then g is continuous at each point of [x_1, x_2], and we have

g(x_1) = f(x_1) − k < 0, g(x_2) = f(x_2) − k > 0.

Applying Bolzano’s theorem to g, we have g(c) = 0 for some c between x_1 and x_2. But
this means f(c) = k, and the proof is complete.

Figure 3.8 Illustrating the intermediate-value theorem.

Figure 3.9 An example for which Bolzano’s theorem is not applicable.

Note: In both Bolzano’s theorem and the intermediate-value theorem, it is assumed
that f is continuous at each point of [a, b], including the endpoints a and b. To understand
why continuity at both endpoints is necessary, we refer to the curve in Figure 3.9. Here
f is continuous everywhere in [a, b] except at a. Although f(a) is negative and f(b) is
positive, there is no x in [a, b] for which f(x) = 0.

We conclude this section with an application of the intermediate-value theorem in which
we prove that every positive real number has a positive nth root, a fact mentioned earlier in
Section I 3.14. We state this as a formal theorem.


THEOREM 3.9. If n is a positive integer and if a > 0, then there is exactly one positive
b such that b^n = a.

Proof. Choose c > 1 such that 0 < a < c, and consider the function f defined on the
interval [0, c] by the equation f(x) = x^n. This function is continuous on [0, c], and at the
endpoints we have f(0) = 0, f(c) = c^n. Since 0 < a < c ≤ c^n, the given number a lies
between the function values f(0) and f(c). Therefore, by the intermediate-value theorem,
we have f(x) = a for some x in (0, c), say for x = b. This proves the existence of at least
one positive b such that b^n = a. There cannot be more than one such b because f is strictly
increasing on [0, c]. This completes the proof.
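
The proof of Theorem 3.9 is an existence argument, but its setup can be turned into a small computation. The sketch below assumes floating-point arithmetic and moderate values of a and n. It takes an interval [0, c] with c > 1 and a < c, exactly as in the proof, and narrows it using the fact that x^n is strictly increasing. The name nth_root and the relative tolerance are illustrative choices.

```python
def nth_root(a, n, tol=1e-12):
    """Approximate the unique positive b with b**n == a, for a > 0 and integer n >= 1."""
    if a <= 0 or n < 1:
        raise ValueError("need a > 0 and n >= 1")
    lo = 0.0
    hi = max(1.0, a) + 1.0  # plays the role of c in the proof: c > 1 and a < c
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        # x**n is strictly increasing on [0, c], so compare mid**n with a.
        if mid**n < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(nth_root(2.0, 2))   # close to the square root of 2
print(nth_root(27.0, 3))  # close to 3
```

Strict monotonicity is what makes the comparison mid**n < a a reliable guide to which half contains b, mirroring the uniqueness part of the proof.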


3.11 Exercises 

1. Let f be a polynomial of degree n, say f(x) = c_0 + c_1 x + ... + c_n x^n, such that the first and last coefficients
c_0 and c_n have opposite signs. Prove that f(x) = 0 for at least one positive x.

2. A real number x_1, such that f(x_1) = 0, is said to be a real root of the equation f(x) = 0. We
say that a real root of an equation has been isolated if we exhibit an interval [a, b] containing
this root and no others. With the aid of Bolzano’s theorem, isolate the real roots of each of
the following equations (each has four real roots).

(a) 3x^4 − 2x^3 − 36x^2 + 36x − 8 = 0.

(b) 2x^4 − 14x^2 + 14x − 1 = 0.

(c) x^4 + 4x^3 + x^2 − 6x + 2 = 0.

3. If n is an odd positive integer and a < 0, prove that there is exactly one negative b such that
b^n = a.

4. Let f(x) = tan x. Although f(π/4) = 1 and f(3π/4) = −1, there is no x in the interval
[π/4, 3π/4] such that f(x) = 0. Explain why this does not contradict Bolzano’s theorem.

5. Given a real-valued function f which is continuous on the closed interval [0, 1]. Assume that
0 ≤ f(x) ≤ 1 for each x in [0, 1]. Prove that there is at least one point c in [0, 1] for which
f(c) = c. Such a point is called a fixed point of f. The result of this exercise is a special case of
Brouwer’s fixed-point theorem. [Hint: Apply Bolzano’s theorem to g(x) = f(x) − x.]

6. Given a real-valued function f which is continuous on the closed interval [a, b]. Assume that
f(a) ≤ a and that f(b) ≥ b. Prove that f has a fixed point in [a, b]. (See Exercise 5.)

3.12 The process of inversion

This section describes another important method that is often used to construct new 
functions from given ones. Before we describe the method in detail, we will illustrate it with 
a simple example. 

Consider the function f defined on the interval [0, 2] by the equation f(x) = 2x + 1.
The range of f is the interval [1, 5]. Each point x in [0, 2] is carried by f onto exactly one
point y in [1, 5], namely

(3.21) y = 2x + 1.

Conversely, for every y in [1, 5], there is exactly one x in [0, 2] for which y = f(x). To find
this x, we solve Equation (3.21) to obtain

x = (1/2)(y − 1).

This equation defines x as a function of y. If we denote this function by g, we have

g(y) = (1/2)(y − 1)

for each y in [1, 5]. The function g is called the inverse of f. Note that g[f(x)] = x for
each x in [0, 2], and that f[g(y)] = y for each y in [1, 5].

Consider now a more general function f with domain A and range B. For each x in A,
there is exactly one y in B such that y = f(x). For each y in B, there is at least one x in A
such that f(x) = y. Suppose that there is exactly one such x. Then we can define a new
function g on B as follows:

g(y) = x means y = f(x).

In other words, the value of g at each point y in B is that unique x in A such that f(x) = y.
This new function g is called the inverse of f. The process by which g is obtained from f is
called inversion. Note that g[f(x)] = x for all x in A, and that f[g(y)] = y for all y in B.

The process of inversion can be applied to any function f having the property that for
each y in the range of f, there is exactly one x in the domain of f such that f(x) = y. In
particular, a function that is continuous and strictly monotonic on an interval [a, b] has this
property. An example is shown in Figure 3.10. Let c = f(a), d = f(b). The intermediate-value
theorem for continuous functions tells us that in the interval [a, b], f takes on every
value between c and d. Moreover, f cannot take on the same value twice because f(x_1) ≠
f(x_2) whenever x_1 ≠ x_2. Therefore, every continuous strictly monotonic function has an
inverse.

The relation between a function f and its inverse g can also be simply explained in the
ordered-pair formulation of the function concept. In Section 1.3 we described a function
f as a set of ordered pairs (x, y) no two of which have the same first element. The inverse
function g is formed by taking the pairs (x, y) in f and interchanging the elements x and y.
That is, (y, x) ∈ g if and only if (x, y) ∈ f. If f is strictly monotonic, then no two pairs in f
have the same second element, and hence no two pairs of g have the same first element.
Thus g is, indeed, a function.
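
The ordered-pair description can be illustrated on a finite sample of pairs. The Python sketch below is only a toy model: a real function on an interval has infinitely many pairs, and the helper name invert_pairs is hypothetical. It swaps the elements of each pair and reports a failure when two pairs share the same second element.

```python
def invert_pairs(pairs):
    """Swap each (x, y) to (y, x); fail if two pairs share the same second element."""
    inverse = {}
    for x, y in pairs:
        if y in inverse and inverse[y] != x:
            raise ValueError(f"not invertible: {inverse[y]} and {x} both map to {y}")
        inverse[y] = x
    return inverse


# Samples of f(x) = 2x + 1 at x = 0, 1, 2: strictly increasing, so inversion succeeds.
g = invert_pairs([(x, 2 * x + 1) for x in range(0, 3)])
print(g)  # {1: 0, 3: 1, 5: 2}

# Samples of f(x) = x^2 at x = -2, ..., 2: not monotonic, so inversion fails.
try:
    invert_pairs([(x, x * x) for x in range(-2, 3)])
except ValueError as err:
    print(err)
```

The failure in the second case is the finite analogue of a function that takes the same value twice and therefore has no inverse.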

EXAMPLE. The nth-root function. If n is a positive integer, let f(x) = x^n for x > 0.
Then f is strictly increasing on every interval [a, b] with 0 < a < b. The inverse function g
is the nth-root function, defined for y > 0 by the equation

g(y) = y^{1/n}.

3.13 Properties of functions preserved by inversion 

Many properties possessed by the function f are transmitted to the inverse g. Figure
3.11 illustrates the relationship between their graphs. One can be obtained from the other
merely by reflection through the line y = x, because a point (u, v) lies on the graph of f
if and only if the point (v, u) lies on the graph of g.




Figure 3.10 A continuous, strictly increasing function.

Figure 3.11 Illustrating the process of inversion.

The properties of monotonicity and continuity possessed by f are transmitted to the
inverse function g, as described by the following theorem.

THEOREM 3.10. Assume f is strictly increasing and continuous on an interval [a, b]. Let
c = f(a) and d = f(b) and let g be the inverse of f. That is, for each y in [c, d], let g(y) be that
x in [a, b] such that y = f(x). Then

(a) g is strictly increasing on [c, d];

(b) g is continuous on [c, d].

Proof. Choose y_1 < y_2 in [c, d] and let x_1 = g(y_1), x_2 = g(y_2). Then y_1 = f(x_1) and
y_2 = f(x_2). Since f is strictly increasing, the relation y_1 < y_2 implies x_1 < x_2, which, in
turn, implies g is strictly increasing on [c, d]. This proves part (a).

Now we prove (b). The proof is illustrated in Figure 3.12. Choose a point y_0 in the open
interval (c, d). To prove g is continuous at y_0, we must show that for every ε > 0 there is
a δ > 0 such that

(3.22) g(y_0) − ε < g(y) < g(y_0) + ε whenever y_0 − δ < y < y_0 + δ.

Let x_0 = g(y_0), so that f(x_0) = y_0. Suppose ε is given. (There is no loss in generality if we
consider only those ε small enough so that both x_0 − ε and x_0 + ε are in [a, b].) Let δ
be the smaller of the two numbers

f(x_0) − f(x_0 − ε) and f(x_0 + ε) − f(x_0).

It is easy to check that this δ works in (3.22). A slight modification of the argument proves
that g is continuous from the right at c, and continuous from the left at d.
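
The check left to the reader in part (b) can be written out as follows. This is a sketch of the omitted step, using only part (a), the relation y_0 = f(x_0), and the choice of δ as the smaller of the two differences of values of f at x_0 − ε, x_0, and x_0 + ε.

```latex
% By the choice of \delta:
%   0 < \delta \le f(x_0) - f(x_0 - \epsilon)  and  \delta \le f(x_0 + \epsilon) - f(x_0).
\begin{align*}
y_0 - \delta < y < y_0 + \delta
  &\implies f(x_0 - \epsilon) \le y_0 - \delta < y < y_0 + \delta \le f(x_0 + \epsilon) \\
  &\implies x_0 - \epsilon < g(y) < x_0 + \epsilon
     \quad \text{(apply $g$, which is strictly increasing by part (a))} \\
  &\implies g(y_0) - \epsilon < g(y) < g(y_0) + \epsilon .
\end{align*}
```

Here δ is positive because f is strictly increasing, and the last line uses x_0 = g(y_0).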

There is a corresponding theorem for decreasing functions. That is, the inverse of a
strictly decreasing continuous function f is strictly decreasing and continuous. This follows
by applying Theorem 3.10 to −f.



Figure 3.12 Proof of the continuity of the inverse function. 

EXAMPLE. Continuity of the nth-root function. The nth-root function g, defined for
y > 0 by the equation g(y) = y^{1/n}, is strictly increasing and continuous on every interval
[c, d] with 0 < c < d, since it is the inverse of a strictly increasing continuous function.
This gives an alternate proof of the continuity of the nth-root function, independent of the
theory of integration. Since the product of continuous functions is continuous, we again
deduce the continuity of the rth-power function, h(y) = y^r, where r = m/n is a positive
rational number and y > 0.

3.14 Inverses of piecewise monotonic functions 

Suppose we try to apply the process of inversion to a function that is not monotonic on
[a, b]. For example, suppose that f(x) = x^2 on an interval of the form [−c, c] on the x-axis.
Each point x in this interval is carried by f into exactly one point y in the interval [0, c^2],
namely,

(3.23) y = x^2.

We can solve Equation (3.23) for x in terms of y, but there are two values of x corresponding
to each y in (0, c^2], namely,

x = √y and x = −√y.






As we have mentioned once before, there was a time when mathematicians would have said
that the inverse g in this case is a double-valued function defined by

g(y) = ±√y.

But since the more modern point of view does not admit double-valuedness as a property
of functions, in a case like this we say that the process of inversion gives rise to two new
functions, say g_1 and g_2, where

(3.24) g_1(y) = √y and g_2(y) = −√y for each y in [0, c^2].

To fit this in with the notion of inverse as explained above, we can look upon the equation
y = x^2 as defining not one function f but two functions f_1 and f_2, say, where

f_1(x) = x^2 if 0 ≤ x ≤ c and f_2(x) = x^2 if −c ≤ x ≤ 0.

These may be considered as distinct functions because they have different domains. Each
function is monotonic on its domain and each has an inverse, the inverse of f_1 being g_1
and the inverse of f_2 being g_2, where g_1 and g_2 are given by (3.24).

This illustrates how the process of inversion can be applied to piecewise monotonic 
functions. We simply consider such a function as a union of monotonic functions and invert 
each monotonic piece. 
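
As a small illustration of inverting each monotonic piece, the following Python sketch encodes the two branches from (3.24) and checks numerically that each one undoes the squaring function on its own half of the interval. The value c = 3 and the helper names are illustrative assumptions.

```python
import math


def square(x):
    """The function f(x) = x^2, viewed on [-c, c]."""
    return x * x


def g1(y):
    """Inverse of the increasing piece f_1(x) = x^2 on [0, c]."""
    return math.sqrt(y)


def g2(y):
    """Inverse of the decreasing piece f_2(x) = x^2 on [-c, 0]."""
    return -math.sqrt(y)


c = 3.0  # illustrative choice of c
for x in (0.0, 0.5, 2.0, c):
    assert math.isclose(g1(square(x)), x, abs_tol=1e-12)    # g_1 undoes f_1
    assert math.isclose(g2(square(-x)), -x, abs_tol=1e-12)  # g_2 undoes f_2
print(g1(4.0), g2(4.0))
```

Applying g1 to the square of a negative number returns its absolute value rather than the number itself, which is why the two pieces must be kept separate.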

We shall make extensive use of the process of inversion in Chapter 6. 

3.15 Exercises 

In each of Exercises 1 through 5, show that f is strictly monotonic on the whole real axis. Let g
denote the inverse of f. Describe the domain of g in each case. Write y = f(x) and solve for x
in terms of y; thus find a formula (or formulas) for computing g(y) for each y in the domain of g.

1. f(x) = x + 1.

2. f(x) = 2x + 5.

3. f(x) = 1 − x.

4. f(x) = x^3.

5. f(x) = x if x < 1; f(x) = x^2 if 1 ≤ x ≤ 4; f(x) = 8x^{1/2} if x > 4.

Mean values. Let f be continuous and strictly monotonic on the positive real axis and let g
denote the inverse of f. If a_1 < a_2 < ... < a_n are n given positive real numbers, we define
their mean value (or average) with respect to f to be the number M_f defined as follows:

M_f = g( (1/n) [f(a_1) + f(a_2) + ... + f(a_n)] ).

In particular, when f(x) = x^p for p ≠ 0, M_f is called the pth power mean. (See also Section
I 4.10.) The exercises which follow deal with properties of mean values.
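
For readers who want to experiment with these means before attempting the exercises, here is a minimal Python sketch of the definition. It assumes the caller supplies both f and its inverse g; the function names and the sample data are illustrative.

```python
def mean_wrt(f, g, values):
    """Mean value M_f = g((f(a_1) + ... + f(a_n)) / n), where g is the inverse of f."""
    return g(sum(f(a) for a in values) / len(values))


def power_mean(values, p):
    """pth power mean of positive numbers, p != 0: f(x) = x**p, g(y) = y**(1/p)."""
    return mean_wrt(lambda x: x**p, lambda y: y ** (1.0 / p), values)


data = [1.0, 2.0, 4.0]
print(power_mean(data, 1))   # arithmetic mean
print(power_mean(data, -1))  # harmonic mean
print(power_mean(data, 2))   # root mean square
```

Trying several values of p on the same data gives numerical evidence for the bounds asked for in the exercises, though of course it proves nothing.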

6. Show that f(M_f) = (1/n) [f(a_1) + ... + f(a_n)]. In other words, the value of f at the average M_f is the
arithmetic mean of the function values f(a_1), ..., f(a_n).

7. Show that a_1 < M_f < a_n. In other words, the average of a_1, ..., a_n lies between the largest
and smallest of the a_i.

8. If h(x) = af(x) + b, where a ≠ 0, show that M_h = M_f. This shows that different functions
may lead to the same average. Interpret this theorem geometrically by comparing the graphs
of h and f.

3.16 The extreme-value theorem for continuous functions

Let f be a real-valued function defined on a set S of real numbers. The function f is said
to have an absolute maximum on the set S if there is at least one point c in S such that

f(x) ≤ f(c) for all x in S.

The number f(c) is called the absolute maximum value of f on S. We say that f has an
absolute minimum on S if there is a point d in S such that

f(x) ≥ f(d) for all x in S.


Figure 3.13 Maximum and minimum values of functions. (a) f(x) = sin x, 0 ≤ x ≤ π.
(b) f(x) = 1/x if 0 < x ≤ 2, f(0) = 1; no absolute maximum exists.

These concepts are illustrated in Figure 3.13. In Figure 3.13(a), S is the closed interval
[0, π] and f(x) = sin x. The absolute minimum, which occurs at both endpoints of the
interval, is 0. The absolute maximum is f(π/2) = 1.

In Figure 3.13(b), S is the closed interval [0, 2] and f(x) = 1/x if x > 0, f(0) = 1. In
this example, f has an absolute minimum at x = 2, but it has no absolute maximum. It
fails to have a maximum because of a discontinuity at a point of S.

We wish to prove that if S is a closed interval and if f is continuous everywhere on S, then
f has both an absolute maximum and an absolute minimum on S. This result, known as
the extreme-value theorem for continuous functions, will be deduced as a simple consequence
of the following theorem.

THEOREM 3.11. BOUNDEDNESS THEOREM FOR CONTINUOUS FUNCTIONS. Let f be continuous
on a closed interval [a, b]. Then f is bounded on [a, b]. That is, there is a number
C ≥ 0 such that |f(x)| ≤ C for all x in [a, b].

Proof. We argue by contradiction, using a technique called the method of successive
bisection. Assume that f is unbounded (not bounded) on [a, b]. Let c be the midpoint of
[a, b]. Since f is unbounded on [a, b] it is unbounded on at least one of the subintervals
[a, c] or [c, b]. Let [a_1, b_1] be that half of [a, b] in which f is unbounded. If f is unbounded
in both halves, let [a_1, b_1] be the left half, [a, c]. Now continue the bisection process
repeatedly, denoting by [a_{n+1}, b_{n+1}] that half of [a_n, b_n] in which f is unbounded, with the
understanding that we choose the left half if f is unbounded in both halves. Since the length
of each interval is half that of its predecessor, we note that the length of [a_n, b_n] is (b − a)/2^n.

Let A denote the set of leftmost endpoints a, a_1, a_2, ..., so constructed, and let α be the
supremum of A. Then α lies in [a, b]. By continuity of f at α, there is an interval of the
form (α − δ, α + δ) in which

(3.25) |f(x) − f(α)| < 1.

If α = a this interval has the form [a, a + δ), and if α = b it has the form (b − δ, b].
Inequality (3.25) implies

|f(x)| < 1 + |f(α)|,

so f is bounded by 1 + |f(α)| in this interval. However, the interval [a_n, b_n] lies inside
(α − δ, α + δ) when n is so large that (b − a)/2^n < δ. Therefore f is also bounded in
[a_n, b_n], contradicting the fact that f is unbounded on [a_n, b_n]. This contradiction completes
the proof.

If f is bounded on [a, b], then the set of all function values f(x) is bounded above and
below. Therefore, this set has a supremum and an infimum which we denote by sup f and
inf f, respectively. That is, we write

sup f = sup {f(x) | a ≤ x ≤ b}, inf f = inf {f(x) | a ≤ x ≤ b}.

For any bounded function we have inf f ≤ f(x) ≤ sup f for all x in [a, b]. Now we prove
that a continuous function takes on both values inf f and sup f somewhere in [a, b].

THEOREM 3.12. EXTREME-VALUE THEOREM FOR CONTINUOUS FUNCTIONS. Assume f is
continuous on a closed interval [a, b]. Then there exist points c and d in [a, b] such that

f(c) = sup f and f(d) = inf f.

Proof. It suffices to prove that f attains its supremum in [a, b]. The result for the infimum
then follows as a consequence because the infimum of f is the negative of the supremum of −f.

Let M = sup f. We shall assume that there is no x in [a, b] for which f(x) = M and
obtain a contradiction. Let g(x) = M − f(x). Then g(x) > 0 for all x in [a, b] so the
reciprocal 1/g is continuous on [a, b]. By Theorem 3.11, 1/g is bounded on [a, b], say 1/g(x)
< C for all x in [a, b], where C > 0. This implies M − f(x) > 1/C, so that f(x) < M −
1/C for all x in [a, b]. This contradicts the fact that M is the least upper bound of f on
[a, b]. Hence, f(x) = M for at least one x in [a, b].

Note: This theorem shows that if f is continuous on [a, b], then sup f is its absolute
maximum, and inf f its absolute minimum. Hence, by the intermediate-value theorem, the
range of f is the closed interval [inf f, sup f].

3.17 The small-span theorem for continuous functions (uniform continuity) 

Let f be real-valued and continuous on a closed interval [a, b] and let M(f) and m(f)
denote, respectively, the maximum and minimum values of f on [a, b]. We shall call the
difference

M(f) − m(f)

the span of f in the interval [a, b]. Some authors use the term oscillation instead of span.
However, oscillation has the disadvantage of suggesting undulating or wavelike functions.
Older texts use the word saltus, which is Latin for leap. The word “span” seems more
suggestive of what is being measured here. We note that the span of f in any subinterval
of [a, b] cannot exceed the span of f in [a, b].

We shall prove next that the interval [a, b] can be partitioned so that the span of f in each
subinterval is arbitrarily small. More precisely, we have the following theorem which we
call the small-span theorem for continuous functions. It is usually referred to in the literature
as the theorem on uniform continuity.

THEOREM 3.13. Let f be continuous on a closed interval [a, b]. Then, for every ε > 0
there is a partition of [a, b] into a finite number of subintervals such that the span of f in every
subinterval is less than ε.

Proof. We argue by contradiction, using the method of successive bisections. Assume
the theorem is false. That is, assume that for some ε, say for ε = ε_0, the interval [a, b]
cannot be partitioned into a finite number of subintervals in each of which the span of f
is less than ε_0. Let c be the midpoint of [a, b]. Then for the same ε_0, the theorem is false in
at least one of the two subintervals [a, c] or [c, b]. (If the theorem were true in both intervals
[a, c] and [c, b], it would also be true in the full interval [a, b].) Let [a_1, b_1] be that half of
[a, b] in which the theorem is false for ε_0. If it is false in both halves, let [a_1, b_1] be the left
half, [a, c]. Now continue the bisection process repeatedly, denoting by [a_{n+1}, b_{n+1}] that
half of [a_n, b_n] in which the theorem is false for ε_0, with the understanding that we choose
the left half if the theorem is false in both halves of [a_n, b_n]. Note that the span of f in each
subinterval [a_n, b_n] so constructed is at least ε_0.

Let A denote the collection of leftmost endpoints a, a_1, a_2, ..., so constructed, and let
α be the least upper bound of A. Then α lies in [a, b]. By continuity of f at α, there is an
interval (α − δ, α + δ) in which the span of f is less than ε_0. (If α = a, this interval is
[a, a + δ), and if α = b, it is (b − δ, b].) However, the interval [a_n, b_n] lies inside (α − δ,
α + δ) when n is so large that (b − a)/2^n < δ, so the span of f in [a_n, b_n] is also less than
ε_0, contradicting the fact that the span of f is at least ε_0 in [a_n, b_n]. This contradiction
completes the proof of Theorem 3.13.
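
The proof above is indirect, but the bisection idea also suggests a direct way to build such a partition in practice. The Python sketch below is only a numerical approximation: it estimates the span on each piece from finitely many sample points, which is exact for monotonic functions but may underestimate the span in general. The names sampled_span and small_span_partition, and the sample count, are illustrative assumptions.

```python
def sampled_span(f, lo, hi, samples=200):
    """Estimate (max f - min f) on [lo, hi] from equally spaced sample points."""
    values = [f(lo + (hi - lo) * i / samples) for i in range(samples + 1)]
    return max(values) - min(values)


def small_span_partition(f, a, b, eps):
    """Bisect [a, b] until the estimated span of f on every piece is below eps.

    Returns the partition points in increasing order.
    """
    points = [a]
    stack = [(a, b)]
    while stack:
        lo, hi = stack.pop()
        if sampled_span(f, lo, hi) < eps:
            points.append(hi)
        else:
            mid = (lo + hi) / 2
            stack.append((mid, hi))  # right half waits on the stack
            stack.append((lo, mid))  # left half is processed next
    return points


partition = small_span_partition(lambda x: x * x, 0.0, 2.0, 0.5)
print(partition)
```

For a continuous function the small-span theorem is what guarantees that this subdivision stops after finitely many steps.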

3.18 The integrability theorem for continuous functions 

The small-span theorem can be used to prove that a function which is continuous on
[a, b] is also integrable on [a, b].

THEOREM 3.14. INTEGRABILITY OF CONTINUOUS FUNCTIONS. If a function f is continuous
at each point of a closed interval [a, b], then f is integrable on [a, b].

Proof. Theorem 3.11 shows that f is bounded on [a, b], so f has an upper integral,
$\bar{I}(f)$, and a lower integral, $\underline{I}(f)$. We shall prove that $\underline{I}(f) = \bar{I}(f)$.

Choose an integer N ≥ 1 and let ε = 1/N. By the small-span theorem, for this choice
of ε there is a partition P = {x_0, x_1, ..., x_n} of [a, b] into n subintervals such that the span
of f in every subinterval is less than ε. Denote by M_k(f) and m_k(f), respectively, the absolute
maximum and minimum values of f in the kth subinterval [x_{k-1}, x_k]. Then we have

M_k(f) − m_k(f) < ε

for each k = 1, 2, ..., n. Now let s_n and t_n be two step functions defined on [a, b] as
follows:

s_n(x) = m_k(f) if x_{k-1} ≤ x < x_k, s_n(b) = m_n(f),

t_n(x) = M_k(f) if x_{k-1} ≤ x < x_k, t_n(b) = M_n(f).

Then we have s_n(x) ≤ f(x) ≤ t_n(x) for all x in [a, b]. Also, we have

$$\int_a^b s_n = \sum_{k=1}^{n} m_k(f)(x_k - x_{k-1}) \quad\text{and}\quad \int_a^b t_n = \sum_{k=1}^{n} M_k(f)(x_k - x_{k-1}).$$

The difference of these two integrals is

$$\int_a^b t_n - \int_a^b s_n = \sum_{k=1}^{n} [M_k(f) - m_k(f)](x_k - x_{k-1}) < \epsilon \sum_{k=1}^{n} (x_k - x_{k-1}) = \epsilon (b - a).$$

Since ε = 1/N, this inequality can be written in the form

(3.26) $$\int_a^b t_n - \int_a^b s_n < \frac{b - a}{N}.$$

On the other hand, the upper and lower integrals of f satisfy the inequalities

$$\int_a^b s_n \le \underline{I}(f) \le \int_a^b t_n \quad\text{and}\quad \int_a^b s_n \le \bar{I}(f) \le \int_a^b t_n.$$

Multiplying the first set of inequalities by (−1) and adding the result to the second set,
we obtain

$$\bar{I}(f) - \underline{I}(f) \le \int_a^b t_n - \int_a^b s_n.$$

Using (3.26) and the relation $\underline{I}(f) \le \bar{I}(f)$, we have

$$0 \le \bar{I}(f) - \underline{I}(f) < \frac{b - a}{N}$$

for every integer N ≥ 1. Therefore, by Theorem I.31, we must have $\underline{I}(f) = \bar{I}(f)$. This
proves that f is integrable on [a, b].
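
The squeeze between lower and upper step functions can be observed numerically. The following Python sketch makes a simplifying assumption that is not part of the proof: f is increasing, so that on each subinterval the minimum and maximum are attained at the left and right endpoints. The function name step_integrals and the uniform partition are illustrative choices.

```python
def step_integrals(f, a, b, n):
    """Integrals of the lower and upper step functions s_n and t_n.

    Assumes f is increasing on [a, b], so on [x_{k-1}, x_k] the minimum is
    f(x_{k-1}) and the maximum is f(x_k). Uses a uniform partition.
    """
    width = (b - a) / n
    xs = [a + k * width for k in range(n + 1)]
    lower = sum(f(xs[k - 1]) * width for k in range(1, n + 1))
    upper = sum(f(xs[k]) * width for k in range(1, n + 1))
    return lower, upper


for n in (10, 100, 1000):
    lower, upper = step_integrals(lambda x: x * x, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

For f(x) = x^2 on [0, 1] the gap between the two sums is (f(1) − f(0))/n = 1/n up to rounding, so it shrinks to zero as the partition is refined, which is the mechanism behind Theorem 3.14.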


3.19 Mean-value theorems for integrals of continuous functions 

In Section 2.16 we defined the average value A(f) of a function f over an interval [a, b]
to be the quotient $\int_a^b f(x)\,dx/(b - a)$. When f is continuous, we can prove that this average
value is equal to the value of f at some point in [a, b].

THEOREM 3.15. MEAN-VALUE THEOREM FOR INTEGRALS. If f is continuous on [a, b],
then for some c in [a, b] we have

$$\int_a^b f(x)\,dx = f(c)(b - a).$$

Proof. Let m and M denote, respectively, the minimum and maximum values of f on
[a, b]. Then m ≤ f(x) ≤ M for all x in [a, b]. Integrating these inequalities and dividing
by b − a, we find m ≤ A(f) ≤ M, where $A(f) = \int_a^b f(x)\,dx/(b - a)$. But now the
intermediate-value theorem tells us that A(f) = f(c) for some c in [a, b]. This completes the
proof.
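
As a quick illustration of Theorem 3.15, consider the following worked instance. The choice f(x) = x^2 on [0, 3] is ours, not the text's; it is picked because the integral is easy to evaluate.

```latex
% Illustrative choice (not from the text): f(x) = x^2 on [a, b] = [0, 3].
\begin{align*}
\int_0^3 x^2 \, dx &= \frac{3^3}{3} = 9, \\
A(f) &= \frac{9}{3 - 0} = 3, \\
f(c) = c^2 = 3 &\implies c = \sqrt{3} \approx 1.732 \in [0, 3].
\end{align*}
```

Here the point c happens to be unique because f is strictly increasing on [0, 3]; in general the theorem guarantees only that at least one such c exists.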

There is a corresponding result for weighted mean values. 


THEOREM 3.16. WEIGHTED MEAN-VALUE THEOREM FOR INTEGRALS. Assume f and g are
continuous on [a, b]. If g never changes sign in [a, b] then, for some c in [a, b], we have

(3.27) $$\int_a^b f(x)g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$

Proof. Since g never changes sign in [a, b], g is always nonnegative or always nonpositive
on [a, b]. Let us assume that g is nonnegative on [a, b]. Then we may argue as in
the proof of Theorem 3.15, except that we integrate the inequalities mg(x) ≤ f(x)g(x) ≤
Mg(x) to obtain

(3.28) $$m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx.$$

If $\int_a^b g(x)\,dx = 0$, this inequality shows that $\int_a^b f(x)g(x)\,dx = 0$. In this case, Equation (3.27)
holds trivially for any choice of c since both members are zero. Otherwise, the integral of g
is positive, and we may divide by this integral in (3.28) and apply the intermediate-value
theorem as before to complete the proof. If g is nonpositive, we apply the same argument
to −g.

The weighted mean-value theorem sometimes leads to a useful estimate for the integral 
of a product of two functions, especially if the integral of one of the factors is easy to 
compute. Examples are given in the next set of exercises. 
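
To show the kind of estimate meant here, the following worked instance applies Theorem 3.16 to a product that is not one of the exercises. The choice of f and g is an illustrative assumption.

```latex
% Illustrative choice (not from the text): f(x) = 1/(1 + x^2), g(x) = x^3 on [0, 1].
% Here g(x) >= 0 on [0, 1] and 1/2 <= f(x) <= 1.
\begin{align*}
\int_0^1 \frac{x^3}{1 + x^2} \, dx &= f(c) \int_0^1 x^3 \, dx = \frac{f(c)}{4}
  \quad \text{for some } c \in [0, 1], \\
\frac{1}{8} \le \int_0^1 \frac{x^3}{1 + x^2} \, dx &\le \frac{1}{4}.
\end{align*}
```

The bounds come from replacing f(c) by the smallest and largest values of f on [0, 1], so no knowledge of the actual point c is needed.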

3.20 Exercises

1. Use Theorem 3.16 to establish the following inequalities:

$$\frac{1}{10\sqrt{2}} \le \int_0^1 \frac{x^9}{\sqrt{1 + x}}\,dx \le \frac{1}{10}.$$

2. Note that $\sqrt{1 - x^2} = (1 - x^2)/\sqrt{1 - x^2}$ and use Theorem 3.16 to obtain the inequalities

$$\frac{11}{24} \le \int_0^{1/2} \sqrt{1 - x^2}\,dx \le \frac{11}{24}\sqrt{\frac{4}{3}}.$$

3. Use the identity 1 + x^6 = (1 + x^2)(1 − x^2 + x^4) and Theorem 3.16 to prove that for a > 0,
we have

$$\frac{1}{1 + a^6}\left(a - \frac{a^3}{3} + \frac{a^5}{5}\right) \le \int_0^a \frac{dx}{1 + x^2} \le a - \frac{a^3}{3} + \frac{a^5}{5}.$$

Take a = 1/10 and calculate the value of the integral rounded off to six decimal places.

4. One of the following two statements is incorrect. Explain why it is wrong.

(a) The integral $\int_{2\pi}^{4\pi} (\sin t)/t\,dt > 0$ because $\int_{2\pi}^{3\pi} (\sin t)/t\,dt > \int_{3\pi}^{4\pi} |\sin t|/t\,dt$.

(b) The integral $\int_{2\pi}^{4\pi} (\sin t)/t\,dt = 0$ because, by Theorem 3.16, for some c between 2π and 4π
we have

$$\int_{2\pi}^{4\pi} \frac{\sin t}{t}\,dt = \frac{1}{c} \int_{2\pi}^{4\pi} \sin t\,dt = \frac{\cos(2\pi) - \cos(4\pi)}{c} = 0.$$

5. If n is a positive integer, use Theorem 3.16 to show that

$$\int_{\sqrt{n\pi}}^{\sqrt{(n+1)\pi}} \sin(t^2)\,dt = \frac{(-1)^n}{c}, \quad\text{where}\quad \sqrt{n\pi} \le c \le \sqrt{(n+1)\pi}.$$

6. Assume f is continuous on [a, b]. If $\int_a^b f(x)\,dx = 0$, prove that f(c) = 0 for at least one c in
[a, b].

7. Assume that f is integrable and nonnegative on [a, b]. If $\int_a^b f(x)\,dx = 0$, prove that f(x) = 0
at each point of continuity of f. [Hint: If f(c) > 0 at a point of continuity c, there is an
interval about c in which f(x) > (1/2) f(c).]

8. Assume f is continuous on [a, b]. Assume also that $\int_a^b f(x)g(x)\,dx = 0$ for every function g
that is continuous on [a, b]. Prove that f(x) = 0 for all x in [a, b].

DIFFERENTIAL CALCULUS


4.1 Historical introduction 

Newton and Leibniz, quite independently of one another, were largely responsible for 
developing the ideas of integral calculus to the point where hitherto insurmountable problems 
could be solved by more or less routine methods. The successful accomplishments of these 
men were primarily due to the fact that they were able to fuse together the integral calculus 
with the second main branch of calculus, differential calculus. 

The central idea of differential calculus is the notion of derivative. Like the integral,
the derivative originated from a problem in geometry: the problem of finding the tangent
line at a point of a curve. Unlike the integral, however, the derivative evolved very late
in the history of mathematics. The concept was not formulated until early in the 17th
Century when the French mathematician Pierre de Fermat attempted to determine the
maxima and minima of certain special functions.

Fermat’s idea, basically very simple, can be understood if we refer to the curve in
Figure 4.1. It is assumed that at each of its points this curve has a definite direction that
can be described by a tangent line. Some of these tangents are indicated by broken lines
in the figure. Fermat noticed that at certain points where the curve has a maximum or
minimum, such as those shown in the figure with abscissae x_0 and x_1, the tangent line
must be horizontal. Thus the problem of locating such extreme values is seen to depend
on the solution of another problem, that of locating the horizontal tangents.

Figure 4.1 The curve has horizontal tangents above the points x_0 and x_1.

This raises the more general question of determining the direction of the tangent line 
at an arbitrary point of the curve. It was the attempt to solve this general problem that 
led Fermat to discover some of the rudimentary ideas underlying the notion of derivative. 

At first sight there seems to be no connection whatever between the problem of finding 
the area of a region lying under a curve and the problem of finding the tangent line at 
a point of a curve. The first person to realize that these two seemingly remote ideas are, 
in fact, rather intimately related appears to have been Newton’s teacher, Isaac Barrow 
(1630-1677). However, Newton and Leibniz were the first to understand the real impor- 
tance of this relation and they exploited it to the fullest, thus inaugurating an unprece- 
dented era in the development of mathematics. 

Although the derivative was originally formulated to study the problem of tangents, it 
was soon found that it also provides a way to calculate velocity and, more generally, the 
rate of change of a function. In the next section we shall consider a special problem in- 
volving the calculation of a velocity. The solution of this problem contains all the essential 
features of the derivative concept and may help to motivate the general definition of 
derivative which is given in Section 4.3. 


4.2 A problem involving velocity 

Suppose a projectile is fired straight up from the ground with initial velocity of 144 feet 
per second. Neglect friction, and assume the projectile is influenced only by gravity so 
that it moves up and back along a straight line. Let f(t ) denote the height in feet that the 
projectile attains t seconds after firing. If the force of gravity were not acting on it, the 
projectile would continue to move upward with a constant velocity, traveling a distance 
of 144 feet every second, and at time t we would have f(t) = 144t. In actual practice, 
gravity causes the projectile to slow down until its velocity decreases to zero and then it 
drops back to earth. Physical experiments suggest that as long as the projectile is aloft, 
its height f(t) is given by the formula

$$f(t) = 144t - 16t^2. \tag{4.1}$$

The term $-16t^2$ is due to the influence of gravity. Note that f(t) = 0 when t = 0 and
when t = 9. This means that the projectile returns to earth after 9 seconds and it is to
be understood that formula (4.1) is valid only for $0 \le t \le 9$.

The problem we wish to consider is this: To determine the velocity of the projectile at 
each instant of its motion. Before we can understand this problem, we must decide on 
what is meant by the velocity at each instant. To do this, we introduce first the notion 
of average velocity during a time interval, say from time t to time t + h. This is defined 
to be the quotient 

$$\frac{\text{change in distance during time interval}}{\text{length of time interval}} = \frac{f(t + h) - f(t)}{h}.$$

This quotient, called a difference quotient, is a number which may be calculated whenever 
both t and t + h are in the interval [0, 9]. The number h may be positive or negative,
but not zero. We shall keep t fixed and see what happens to the difference quotient as 
we take values of h with smaller and smaller absolute value. 

For example, consider the instant t = 2. The distance traveled after 2 seconds is 


f(2) = 288 - 64 = 224. 


At time t = 2 + h, the distance covered is 

$$f(2 + h) = 144(2 + h) - 16(2 + h)^2 = 224 + 80h - 16h^2.$$

Therefore the average velocity in the interval from t = 2 to t = 2 + h is

$$\frac{f(2 + h) - f(2)}{h} = \frac{80h - 16h^2}{h} = 80 - 16h.$$

As we take values of h with smaller and smaller absolute value, this average velocity gets 
closer and closer to 80. For example, if h = 0.1, we get an average velocity of 78.4; when 
h = 0.001, we get 79.984; when h = 0.00001, we obtain the value 79.99984; and when 
h = -0.00001, we obtain 80.00016. The important thing is that we can make the average 
velocity as close to 80 as we please by taking |h| sufficiently small. In other words, the
average velocity approaches 80 as a limit when h approaches zero. It seems natural to call 
this limiting value the instantaneous velocity at time t = 2. 
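
The average velocities quoted above can be reproduced with a short Python sketch. It is offered only as an illustration; the names `height` and `average_velocity` are our own choices and do not appear in the text.

```python
def height(t):
    """Height in feet t seconds after firing, from formula (4.1)."""
    return 144 * t - 16 * t ** 2


def average_velocity(t, h):
    """Difference quotient [f(t + h) - f(t)] / h, defined only for h != 0."""
    if h == 0:
        raise ValueError("h must be nonzero")
    return (height(t + h) - height(t)) / h


# The quotient at t = 2 approaches 80 as |h| shrinks, from either side.
for h in (0.1, 0.001, 0.00001, -0.00001):
    print(f"h = {h:>9}: average velocity = {average_velocity(2, h):.5f}")
```

Because the computation uses floating-point arithmetic, the printed values agree with the exact ones only up to rounding.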

The same kind of calculation can be carried out for any other instant. The average
velocity for an arbitrary time interval from t to t + h is given by the quotient

$$\frac{f(t + h) - f(t)}{h} = \frac{[144(t + h) - 16(t + h)^2] - [144t - 16t^2]}{h} = 144 - 32t - 16h.$$


When h approaches zero, the expression on the right approaches 144 — 32t as a limit, 
and this limit is defined to be the instantaneous velocity at time t. If we denote the in- 
stantaneous velocity by v(t), we may write 

$$v(t) = 144 - 32t. \tag{4.2}$$

The formula in (4.1) for the distance f(t) defines a function f which tells us how high
the projectile is at each instant of its motion. We may refer to f as the position function.
Its domain is the closed interval [0, 9] and its graph is shown in Figure 4.2(a). [The scale
on the vertical axis is distorted in both Figures 4.2(a) and (b).] The formula in (4.2) for
the velocity v(t) defines a new function v which tells us how fast the projectile is moving
at each instant of its motion. This is called the velocity function, and its graph is shown in
Figure 4.2(b). As t increases from 0 to 9, v(t) decreases steadily from v(0) = 144 to
v(9) = -144. To find the time t for which v(t) = 0, we solve the equation 144 = 32t to obtain
t = 9/2. Therefore, at the midpoint of the motion the influence of gravity reduces the
velocity to zero, and the projectile is momentarily at rest. The height at this instant
is f(9/2) = 324. When t > 9/2, the velocity is negative, indicating that the height is
decreasing.
The limit process by which v(t) is obtained from the difference quotient is written
symbolically as follows:

$$v(t) = \lim_{h \to 0} \frac{f(t + h) - f(t)}{h}. \tag{4.3}$$


This equation is used to define velocity not only for this particular example but, more 
generally, for any particle moving along a straight line, provided the position function f 
is such that the difference quotient tends to a definite limit as h approaches zero. 



Figure 4.2 (a) Graph of the position function $f(t) = 144t - 16t^2$. (b) Graph of the
velocity function v(t) = 144 - 32t.


4.3 The derivative of a function 

The example described in the foregoing section points the way to the introduction of 
the concept of derivative. We begin with a function f defined at least on some open
interval (a, b) on the x-axis. Then we choose a fixed point x in this interval and introduce
the difference quotient

$$\frac{f(x + h) - f(x)}{h},$$

where the number h, which may be positive or negative (but not zero), is such that x + h
also lies in (a, b). The numerator of this quotient measures the change in the function
when x changes from x to x + h. The quotient itself is referred to as the average rate of
change of f in the interval joining x to x + h.

Now we let h approach zero and see what happens to this quotient. If the quotient
approaches some definite value as a limit (which implies that the limit is the same whether
h approaches zero through positive values or through negative values), then this limit is
called the derivative of f at x and is denoted by the symbol f'(x) (read as “f prime of x”).
Thus, the formal definition of f'(x) may be stated as follows:

DEFINITION OF DERIVATIVE. The derivative f'(x) is defined by the equation

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \tag{4.4}$$

provided the limit exists. The number f'(x) is also called the rate of change of f at x.


By comparing (4.4) with (4.3), we see that the concept of instantaneous velocity is 
merely an example of the concept of derivative. The velocity v(t) is equal to the derivative 
f'(t), where f is the function which measures position. This is often described by saying
that velocity is the rate of change of position with respect to time. In the example worked
out in Section 4.2, the position function f is described by the equation

$$f(t) = 144t - 16t^2,$$

and its derivative f' is a new function (velocity) given by

$$f'(t) = 144 - 32t.$$

In general, the limit process which produces f'(x) from f(x) gives us a way of obtaining
a new function f' from a given function f. The process is called differentiation, and f' is
called the first derivative of f. If f', in turn, is defined on an open interval, we can try to
compute its first derivative, denoted by f'' and called the second derivative of f. Similarly,
the nth derivative of f, denoted by $f^{(n)}$, is defined to be the first derivative of $f^{(n-1)}$.
We make the convention that $f^{(0)} = f$, that is, the zeroth derivative is the function itself.

For rectilinear motion, the first derivative of velocity (second derivative of position) is 
called acceleration. For example, to compute the acceleration in the example of Section 
4.2, we can use Equation (4.2) to form the difference quotient 


$$\frac{v(t + h) - v(t)}{h} = \frac{[144 - 32(t + h)] - [144 - 32t]}{h} = \frac{-32h}{h} = -32.$$

Since this quotient has the constant value -32 for each $h \ne 0$, its limit as $h \to 0$ is also
-32. Thus, the acceleration in this problem is constant and equal to -32. This result
tells us that the velocity is decreasing at the rate of 32 feet per second every second. In 9
seconds the total decrease in velocity is $9 \cdot 32 = 288$ feet per second. This agrees with the
fact that during the 9 seconds of motion the velocity changes from v(0) = 144 to
v(9) = -144.


4.4 Examples of derivatives

EXAMPLE 1. Derivative of a constant function. Suppose f is a constant function, say
f(x) = c for all x. The difference quotient is

$$\frac{f(x + h) - f(x)}{h} = \frac{c - c}{h} = 0.$$

Since the quotient is 0 for all $h \ne 0$, its limit, f'(x), is also 0 for every x. In other words, a
constant function has a zero derivative everywhere.

EXAMPLE 2. Derivative of a linear function. Suppose f is a linear function, say f(x) =
mx + b for all real x. If $h \ne 0$, we have

$$\frac{f(x + h) - f(x)}{h} = \frac{m(x + h) + b - (mx + b)}{h} = \frac{mh}{h} = m.$$

Since the difference quotient does not change when h approaches 0, we conclude that

$$f'(x) = m \quad \text{for every } x.$$

Thus, the derivative of a linear function is a constant function. 

EXAMPLE 3. Derivative of a positive integer power function. Consider next the case
$f(x) = x^n$, where n is a positive integer. The difference quotient becomes

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^n - x^n}{h}.$$

To study this quotient as h approaches 0, we can proceed in two ways, either by factoring
the numerator as a difference of two nth powers or by using the binomial theorem to
expand $(x + h)^n$. We shall carry out the details by the first method and leave the other
method as an exercise for the reader. (See Exercise 39 in Section 4.6.)

From elementary algebra we have the identity†

$$a^n - b^n = (a - b) \sum_{k=0}^{n-1} a^k b^{n-1-k}.$$

If we take a = x + h and b = x and divide both sides by h, this identity becomes

$$\frac{(x + h)^n - x^n}{h} = \sum_{k=0}^{n-1} (x + h)^k x^{n-1-k}.$$

There are n terms in the sum. As h approaches 0, $(x + h)^k$ approaches $x^k$, the kth term
approaches $x^k x^{n-1-k} = x^{n-1}$, and therefore the sum of all n terms approaches $n x^{n-1}$.
From this it follows that

$$f'(x) = n x^{n-1} \quad \text{for every } x.$$

† This identity is an immediate consequence of the telescoping property of finite sums. In fact, if we multiply
each term of the sum by (a - b), we find

$$(a - b) \sum_{k=0}^{n-1} a^k b^{n-1-k} = \sum_{k=0}^{n-1} \left( a^{k+1} b^{n-(k+1)} - a^k b^{n-k} \right) = a^n - b^n.$$

EXAMPLE 4. Derivative of the sine function. Let s(x) = sin x. The difference quotient
in question is

$$\frac{s(x + h) - s(x)}{h} = \frac{\sin(x + h) - \sin x}{h}.$$

To transform this into a form that makes it possible to calculate the limit as $h \to 0$, we use
the trigonometric identity

$$\sin y - \sin x = 2 \sin\frac{y - x}{2} \cos\frac{y + x}{2}$$

with y = x + h. This leads to the formula

$$\frac{\sin(x + h) - \sin x}{h} = \frac{\sin(h/2)}{h/2} \cos\left(x + \frac{h}{2}\right).$$

As $h \to 0$, the factor $\cos(x + \tfrac{1}{2}h) \to \cos x$ because of the continuity of the cosine. Also,
the limit formula

$$\lim_{x \to 0} \frac{\sin x}{x} = 1,$$

established earlier in Section 3.4, shows that

$$\frac{\sin(h/2)}{h/2} \to 1 \quad \text{as } h \to 0. \tag{4.5}$$

Therefore the difference quotient has the limit cos x as $h \to 0$. In other words, s'(x) =
cos x for every x; the derivative of the sine function is the cosine function.
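
As an informal check of Example 4, the Python sketch below compares the difference quotient of the sine with the factored form used in the proof and with cos x. The helper names and the sample point x = 1 are our own assumptions, chosen only for illustration.

```python
import math


def difference_quotient(f, x, h):
    """[f(x + h) - f(x)] / h for h != 0."""
    return (f(x + h) - f(x)) / h


def factored_form(x, h):
    """[sin(h/2) / (h/2)] * cos(x + h/2), the form obtained in Example 4."""
    return math.sin(h / 2) / (h / 2) * math.cos(x + h / 2)


x = 1.0  # arbitrary sample point
for h in (0.1, 0.01, 0.001):
    q = difference_quotient(math.sin, x, h)
    print(f"h = {h}: quotient = {q:.6f}, factored = {factored_form(x, h):.6f}, "
          f"cos x = {math.cos(x):.6f}")
```

The first two columns agree up to rounding for every h, while both drift toward cos x only as h shrinks, which is what the limit argument predicts.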

EXAMPLE 5. The derivative of the cosine function. Let c(x) = cos x. We shall prove that
c'(x) = -sin x; that is, the derivative of the cosine function is minus the sine function.
We start with the identity

$$\cos y - \cos x = -2 \sin\frac{y - x}{2} \sin\frac{y + x}{2}$$

and take y = x + h. This leads to the formula

$$\frac{\cos(x + h) - \cos x}{h} = -\frac{\sin(h/2)}{h/2} \sin\left(x + \frac{h}{2}\right).$$

Continuity of the sine shows that $\sin(x + \tfrac{1}{2}h) \to \sin x$ as $h \to 0$; from (4.5), we obtain

$$c'(x) = -\sin x.$$


EXAMPLE 6. Derivative of the nth-root function. If n is a positive integer, let $f(x) = x^{1/n}$
for x > 0. The difference quotient for f is

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^{1/n} - x^{1/n}}{h}.$$

Let $u = (x + h)^{1/n}$ and let $v = x^{1/n}$. Then we have $u^n = x + h$ and $v^n = x$, so
$h = u^n - v^n$, and the difference quotient becomes

$$\frac{f(x + h) - f(x)}{h} = \frac{u - v}{u^n - v^n} = \frac{1}{u^{n-1} + u^{n-2} v + \cdots + u v^{n-2} + v^{n-1}}.$$

The continuity of the nth-root function shows that $u \to v$ as $h \to 0$. Therefore each term
in the denominator on the right has the limit $v^{n-1}$ as $h \to 0$. There are n terms altogether,
so the difference quotient has the limit $v^{1-n}/n$. Since $v = x^{1/n}$, this proves that

$$f'(x) = \frac{1}{n} x^{1/n - 1}.$$


EXAMPLE 7. Continuity of functions having derivatives. If a function f has a derivative at
a point x, then it is also continuous at x. To prove this, we use the identity

$$f(x + h) = f(x) + h \left( \frac{f(x + h) - f(x)}{h} \right),$$

which is valid for $h \ne 0$. If we let $h \to 0$, the difference quotient on the right approaches
f'(x) and, since this quotient is multiplied by a factor which tends to 0, the second term on
the right approaches $0 \cdot f'(x) = 0$. This shows that $f(x + h) \to f(x)$ as $h \to 0$, and hence
that f is continuous at x.


This example provides a new way of showing that functions are continuous. Every
time we establish the existence of a derivative f'(x), we also establish, at the same time,
the continuity of f at x. It should be noted, however, that the converse is not true.
Continuity at x does not necessarily mean that the derivative f'(x) exists. For example, when
f(x) = |x|, the point x = 0 is a point of continuity of f [since $f(x) \to 0$ as $x \to 0$] but there
is no derivative at 0. (See Figure 4.3.) The difference quotient [f(0 + h) - f(0)]/h is
equal to |h|/h. This has the value +1 if h > 0 and -1 if h < 0, and hence does not tend
to a limit as $h \to 0$.

Figure 4.3 The function is continuous at 0 but f'(0) does not exist.
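
The failure of the limit at 0 is easy to observe numerically. The following Python sketch evaluates the difference quotient of the absolute-value function at 0 for a few positive and negative h; the function name `quotient_at_zero` is our own and is used only for this illustration.

```python
def quotient_at_zero(f, h):
    """Difference quotient [f(0 + h) - f(0)] / h for h != 0."""
    return (f(0 + h) - f(0)) / h


# Positive h always gives +1 and negative h always gives -1, so the
# quotient cannot approach a single limit as h tends to 0.
for h in (0.1, 0.001, -0.001, -0.1):
    print(f"h = {h:>6}: quotient = {quotient_at_zero(abs, h)}")
```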


4.5 The algebra of derivatives 

Just as the limit theorems of Section 3.4 tell us how to compute limits of the sum, differ- 
ence, product, and quotient of two functions, so the next theorem provides us with a 
corresponding set of rules for computing derivatives. 

THEOREM 4.1. Let f and g be two functions defined on a common interval. At each point
where f and g have a derivative, the same is true of the sum f + g, the difference f - g,
the product $f \cdot g$, and the quotient f/g. (For f/g we need the extra proviso that g is not zero at
the point in question.) The derivatives of these functions are given by the following formulas:

(i) $(f + g)' = f' + g'$,

(ii) $(f - g)' = f' - g'$,

(iii) $(f \cdot g)' = f \cdot g' + g \cdot f'$,

(iv) $\left( \dfrac{f}{g} \right)' = \dfrac{g \cdot f' - f \cdot g'}{g^2}$ at points x where $g(x) \ne 0$.

We shall prove this theorem in a moment, but first we want to mention some of its 
consequences. A special case of (iii) occurs when one of the two functions is constant,
say g(x) = c for all x under consideration. In this case, (iii) becomes $(c \cdot f)' = c \cdot f'$. In
other words, the derivative of a constant times f is the constant times the derivative of f.
Combining this with the fact that the derivative of a sum is the sum of derivatives [property
(i)], we find that for every pair of constants $c_1$ and $c_2$ we have

$$(c_1 f + c_2 g)' = c_1 f' + c_2 g'.$$

This is called the linearity property of the derivative, and it is analogous to the linearity
property of the integral. Using mathematical induction, we can extend the linearity
property to arbitrary finite sums as follows:

$$\left( \sum_{i=1}^{n} c_i \cdot f_i \right)' = \sum_{i=1}^{n} c_i \cdot f_i',$$

where $c_1, \ldots, c_n$ are constants and $f_1, \ldots, f_n$ are functions with derivatives $f_1', \ldots, f_n'$.

Every derivative formula can be written in two ways, either as an equality between two 
functions or as an equality involving numbers. The properties of Theorem 4.1, as written 
above, are equations involving functions. For example, property (i) states that the derivative
of the function f + g is the sum of the two functions f' and g'. When these functions
are evaluated at a point x, we obtain formulas involving numbers. Thus formula (i)
implies

$$(f + g)'(x) = f'(x) + g'(x).$$

We proceed now to the proof of Theorem 4.1. 

Proof of (i). Let x be a point where both derivatives f'(x) and g'(x) exist. The difference
quotient for f + g is

$$\frac{[f(x + h) + g(x + h)] - [f(x) + g(x)]}{h} = \frac{f(x + h) - f(x)}{h} + \frac{g(x + h) - g(x)}{h}.$$

When $h \to 0$ the first quotient on the right approaches f'(x), the second approaches g'(x),
and hence the sum approaches f'(x) + g'(x). This proves (i), and the proof of (ii) is similar.

Proof of (iii). The difference quotient for the product $f \cdot g$ is

$$\frac{f(x + h) g(x + h) - f(x) g(x)}{h}. \tag{4.6}$$

To study this quotient as $h \to 0$, we add and subtract in the numerator a term which enables
us to write (4.6) as a sum of two terms involving difference quotients of f and g. Adding
and subtracting g(x) f(x + h), we see that (4.6) becomes

$$\frac{f(x + h) g(x + h) - f(x) g(x)}{h} = g(x) \frac{f(x + h) - f(x)}{h} + f(x + h) \frac{g(x + h) - g(x)}{h}.$$

When $h \to 0$ the first term on the right approaches g(x) f'(x). Since f is continuous at x,
we have $f(x + h) \to f(x)$, so the second term approaches f(x) g'(x). This proves (iii).


Proof of (iv). A special case of (iv) occurs when f(x) = 1 for all x. In this case f'(x) = 0
for all x and (iv) reduces to the formula

$$\left( \frac{1}{g} \right)' = -\frac{g'}{g^2}, \tag{4.7}$$

provided $g(x) \ne 0$. We can deduce the general formula (iv) from this special case by
writing f/g as a product and using (iii), since

$$\left( f \cdot \frac{1}{g} \right)' = \frac{1}{g} \cdot f' + f \cdot \left( \frac{1}{g} \right)' = \frac{f'}{g} - \frac{f \cdot g'}{g^2} = \frac{g \cdot f' - f \cdot g'}{g^2}.$$

Therefore it remains to prove (4.7). The difference quotient for 1/g is

$$\frac{[1/g(x + h)] - [1/g(x)]}{h} = -\frac{g(x + h) - g(x)}{h} \cdot \frac{1}{g(x)} \cdot \frac{1}{g(x + h)}. \tag{4.8}$$

When $h \to 0$, the first quotient on the right approaches g'(x) and the third factor approaches
1/g(x). The continuity of g at x is required since we are using the fact that
$g(x + h) \to g(x)$ as $h \to 0$. Hence the quotient in (4.8) approaches $-g'(x)/g(x)^2$, and this
proves (4.7).

Note: In order to write (4.8) we need to know that $g(x + h) \ne 0$ for all sufficiently small
h. This follows from Theorem 3.7.
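
The product and quotient formulas of Theorem 4.1 can be spot-checked numerically. The Python sketch below does this for one pair of functions at one point; the choices f(x) = sin x, g(x) = x^2 + 1, the point x = 0.7, and the step h = 1e-6 are assumptions made here purely for illustration, and a numerical check of this kind is no substitute for the proof.

```python
import math


def difference_quotient(func, x, h=1e-6):
    """Forward difference quotient [func(x + h) - func(x)] / h."""
    return (func(x + h) - func(x)) / h


def f(x):
    return math.sin(x)


def f_prime(x):
    return math.cos(x)  # Example 4 of Section 4.4


def g(x):
    return x * x + 1.0  # never zero, so f/g is defined everywhere


def g_prime(x):
    return 2.0 * x


x = 0.7  # arbitrary sample point

# Right-hand sides of formulas (iii) and (iv) of Theorem 4.1.
product_rule = f(x) * g_prime(x) + g(x) * f_prime(x)
quotient_rule = (g(x) * f_prime(x) - f(x) * g_prime(x)) / g(x) ** 2

# Difference quotients of the product and the quotient themselves.
product_numeric = difference_quotient(lambda t: f(t) * g(t), x)
quotient_numeric = difference_quotient(lambda t: f(t) / g(t), x)

print(f"product:  formula = {product_rule:.6f}, quotient with small h = {product_numeric:.6f}")
print(f"quotient: formula = {quotient_rule:.6f}, quotient with small h = {quotient_numeric:.6f}")
```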

Theorem 4.1, when used in conjunction with the examples worked out in Section 4.4, 
enables us to derive new examples of differentiation formulas. 

EXAMPLE 1. Polynomials. In Example 3 of Section 4.4 we showed that if $f(x) = x^n$,
where n is a positive integer, then $f'(x) = n x^{n-1}$. The reader may find it instructive to
rederive this result as a consequence of the special case n = 1, using mathematical induction
in conjunction with the formula for differentiating a product.

Using this result along with the linearity property, we can differentiate any polynomial
by computing the derivative of each term and adding the derivatives. Thus, if

$$f(x) = \sum_{k=0}^{n} c_k x^k,$$

then, by differentiating term by term, we obtain

$$f'(x) = \sum_{k=0}^{n} k c_k x^{k-1}.$$

Note that the derivative of a polynomial of degree n is a new polynomial of degree n - 1.
For example, if $f(x) = 2x^3 + 5x^2 - 7x + 8$, then $f'(x) = 6x^2 + 10x - 7$.
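
Term-by-term differentiation of a polynomial is mechanical enough to sketch in a few lines of Python. In the sketch below a polynomial is stored as a list of coefficients with `coeffs[k]` holding the coefficient of x^k; this representation and the function names are our own assumptions, not notation from the text.

```python
def differentiate(coeffs):
    """Return the coefficient list of the derivative.

    The term c_k x^k contributes k * c_k x^(k - 1); the constant term
    (k = 0) contributes nothing and is dropped.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


f = [8, -7, 5, 2]            # 2x^3 + 5x^2 - 7x + 8
f_prime = differentiate(f)   # coefficients of 6x^2 + 10x - 7
print(f_prime)               # [-7, 10, 6]
print(evaluate(f_prime, 1))  # 6 + 10 - 7 = 9
```

Note that the derivative of a constant polynomial comes out as the empty list, which `evaluate` treats as the zero polynomial.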

EXAMPLE 2. Rational functions. If r is the quotient of two polynomials, say r(x) =
p(x)/q(x), then the derivative r'(x) may be computed by the quotient formula (iv) in
Theorem 4.1. The derivative r'(x) exists at every x for which the denominator $q(x) \ne 0$.
Note that the function r' so defined is itself a rational function. In particular, when
$r(x) = 1/x^m$, where m is a positive integer and $x \ne 0$, we find

$$r'(x) = \frac{x^m \cdot 0 - m x^{m-1}}{x^{2m}} = \frac{-m}{x^{m+1}}.$$

If this is written in the form $r'(x) = -m x^{-m-1}$, it provides an extension from positive
exponents to negative exponents of the formula for differentiating nth powers.

EXAMPLE 3. Rational powers. Let $f(x) = x^r$ for x > 0, where r is a rational number.
We have already proved the differentiation formula

$$f'(x) = r x^{r-1} \tag{4.9}$$

for r = 1/n, where n is a positive integer. Now we extend it to all rational powers. The
formula for differentiating a product shows that Equation (4.9) is also valid for r = 2/n
and, by induction, for r = m/n, where m is any positive integer. (The induction argument
refers to m.) Therefore Equation (4.9) is valid for all positive rational r. The formula
for differentiating a quotient now shows that (4.9) is also valid for negative rational r.
Thus, if $f(x) = x^{2/3}$, we have $f'(x) = \tfrac{2}{3} x^{-1/3}$. If $f(x) = x^{-1/2}$, then
$f'(x) = -\tfrac{1}{2} x^{-3/2}$. In each case, we require x > 0.
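
As an informal illustration of formula (4.9), the Python sketch below compares difference quotients of the function with exponent 2/3 against the value given by the formula. The sample point x = 8, where the formula gives exactly 1/3, and the function names are our own choices.

```python
def difference_quotient(f, x, h):
    """[f(x + h) - f(x)] / h for h != 0."""
    return (f(x + h) - f(x)) / h


def f(x):
    return x ** (2 / 3)


def f_prime(x):
    # Formula (4.9) with r = 2/3, valid for x > 0.
    return (2 / 3) * x ** (-1 / 3)


x = 8.0  # sample point; the formula gives (2/3) * (1/2) = 1/3 here
for h in (0.1, 0.001, 0.00001):
    print(f"h = {h:>7}: quotient = {difference_quotient(f, x, h):.8f}")
print(f"formula (4.9): {f_prime(x):.8f}")
```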


4.6 Exercises 

1. If $f(x) = 2 + x - x^2$, compute f'(0), f'(1/2), f'(1), f'(-10).

2. If $f(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 - 2x$, find all x for which (a) f'(x) = 0; (b) f'(x) = -2; (c) f'(x) = 10.

In Exercises 3 through 12, obtain a formula for f'(x) if f(x) is described as indicated.

3. $f(x) = x^2 + 3x + 2$.

4. $f(x) = x^4 + \sin x$.

5. $f(x) = x^4 \sin x$.

6. $f(x) = \dfrac{1}{x + 1}$, $x \ne -1$.

7. $f(x) = \dfrac{1}{x^2 + 1} + x^5 \cos x$.

8. $f(x) = \dfrac{x}{x - 1}$, $x \ne 1$.

9. $f(x) = \dfrac{1}{2 + \cos x}$.

10. $f(x) = \dfrac{x^2 + 3x + 2}{x^4 + x^2 + 1}$.

11. $f(x) = \dfrac{2 - \sin x}{2 - \cos x}$.

12. $f(x) = \dfrac{x \sin x}{1 + x^2}$.


13. Assume that the height f(t) of a projectile, t seconds after being fired directly upward from the
ground with an initial velocity of $v_0$ ft/sec, is given by the formula

$$f(t) = v_0 t - 16t^2.$$

(a) Use the method described in Section 4.2 to show that the average velocity of the projectile
during a time interval from t to t + h is $v_0 - 32t - 16h$ ft/sec, and that the instantaneous
velocity at time t is $v_0 - 32t$ ft/sec.

(b) Compute (in terms of $v_0$) the time required for the velocity to drop to zero.

(c) What is the velocity on return to earth? 

(d) What must the initial velocity be for the projectile to return to earth after 1 sec? after 
10 sec? after T sec? 

(e) Show that the projectile moves with constant acceleration. 

(f) Give an example of another formula for the height which will lead to a constant accelera- 
tion of -20 ft/sec/sec. 

14. What is the rate of change of the volume of a cube with respect to the length of each edge? 

15. (a) The area of a circle of radius r is $\pi r^2$ and its circumference is $2\pi r$. Show that the rate of
change of the area with respect to the radius is equal to the circumference.

(b) The volume of a sphere of radius r is $4\pi r^3/3$ and its surface area is $4\pi r^2$. Show that the
rate of change of the volume with respect to the radius is equal to the surface area.

In Exercises 16 through 23, obtain a formula for f’(x) if f(x) is defined as indicated. 

16. $f(x) = \sqrt{x}$, x > 0.

17. $f(x) = \dfrac{1}{1 + \sqrt{x}}$, x > 0.

18. $f(x) = x^{3/2}$, x > 0.

19. $f(x) = x^{-3/2}$, x > 0.

20. $f(x) = x^{1/2} + x^{1/3} + x^{1/4}$, x > 0.

21. $f(x) = x^{-1/2} + x^{-1/3} + x^{-1/4}$, x > 0.

22. $f(x) = \dfrac{\sqrt{x}}{1 + x}$, x > 0.

23. f(x) = [expression missing in the source], x > 0.


24. Let $f_1, \ldots, f_n$ be n functions having derivatives $f_1', \ldots, f_n'$. Develop a rule for differentiating
the product $g = f_1 \cdots f_n$ and prove it by mathematical induction. Show that for those points
x, where none of the function values $f_1(x), \ldots, f_n(x)$ are zero, we have

$$\frac{g'(x)}{g(x)} = \frac{f_1'(x)}{f_1(x)} + \cdots + \frac{f_n'(x)}{f_n(x)}.$$


25. Verify the entries in the following short table of derivatives. It is understood that the formulas
hold for those x for which f(x) is defined.

    f(x)        f'(x)             f(x)        f'(x)

    tan x       $\sec^2 x$        sec x       tan x sec x

    cot x       $-\csc^2 x$       csc x       -cot x csc x


In Exercises 26 through 35, compute the derivative f'(x). It is understood that each formula
holds for those x for which f(x) is defined.

26. f(x) = tan x sec x.

27. f(x) = x tan x.

28. $f(x) = \dfrac{1}{x} + \dfrac{2}{x^2} + \dfrac{3}{x^3}$.

29. $f(x) = \dfrac{2x}{1 - x^2}$.

30. $f(x) = \dfrac{1 + x - x^2}{1 - x + x^2}$.

31. $f(x) = \dfrac{\sin x}{x}$.

32. $f(x) = \dfrac{1}{x + \sin x}$.

33. $f(x) = \dfrac{ax + b}{cx + d}$.

34. $f(x) = \dfrac{\cos x}{2x^2 + 3}$.

35. $f(x) = \dfrac{ax^2 + bx + c}{\sin x + \cos x}$.

36. If f(x) = (ax + b) sin x + (cx + d) cos x, determine values of the constants a, b, c, d such
that f'(x) = x cos x.

37. If $g(x) = (ax^2 + bx + c) \sin x + (dx^2 + ex + f) \cos x$, determine values of the constants
a, b, c, d, e, f such that $g'(x) = x^2 \sin x$.

38. Given the formula

$$1 + x + x^2 + \cdots + x^n = \frac{x^{n+1} - 1}{x - 1}$$

(valid if $x \ne 1$), determine, by differentiation, formulas for the following sums:

(a) $1 + 2x + 3x^2 + \cdots + n x^{n-1}$,

(b) $1^2 x + 2^2 x^2 + 3^2 x^3 + \cdots + n^2 x^n$.
39. Let $f(x) = x^n$, where n is a positive integer. Use the binomial theorem to expand $(x + h)^n$
and derive the formula

$$\frac{f(x + h) - f(x)}{h} = n x^{n-1} + \frac{n(n - 1)}{2} x^{n-2} h + \cdots + n x h^{n-2} + h^{n-1}.$$

Express the sum on the right in summation notation. Let $h \to 0$ and deduce that $f'(x) = n x^{n-1}$.
State which limit theorems you are using. (This result was derived in another way in Example
3 of Section 4.4.)


4.7 Geometric interpretation of the derivative as a slope 

The procedure used to define the derivative has a geometric interpretation which leads in
a natural way to the idea of a tangent line to a curve. A portion of the graph of a function
f is shown in Figure 4.4. Two of its points P and Q are shown with respective coordinates
(x, f(x)) and (x + h, f(x + h)). Consider the right triangle with hypotenuse PQ; its
altitude, f(x + h) - f(x), represents the difference of the ordinates of the two points Q
and P. Therefore, the difference quotient

$$\frac{f(x + h) - f(x)}{h} \tag{4.10}$$

represents the trigonometric tangent of the angle $\alpha$ that PQ makes with the horizontal.
The real number $\tan \alpha$ is called the slope of the line through P and Q and it provides a
way of measuring the “steepness” of this line. For example, if f is a linear function, say
f(x) = mx + b, the difference quotient (4.10) has the value m, so m is the slope of the
line.

Figure 4.4 Geometric interpretation of the difference quotient as the tangent of an angle.

Figure 4.5 Lines of various slopes.

Some examples of lines of various slopes are shown in Figure 4.5. For a horizontal line,
$\alpha = 0$ and the slope, $\tan \alpha$, is also 0. If $\alpha$ lies between 0 and $\tfrac{1}{2}\pi$, the line is rising as we move
from left to right and the slope is positive. If $\alpha$ lies between $\tfrac{1}{2}\pi$ and $\pi$, the line is falling as
we move from left to right and the slope is negative. A line for which $\alpha = \tfrac{1}{4}\pi$ has slope 1.
As $\alpha$ increases from 0 to $\tfrac{1}{2}\pi$, $\tan \alpha$ increases without bound, and the corresponding lines
of slope $\tan \alpha$ approach a vertical position. Since $\tan \tfrac{1}{2}\pi$ is not defined, we say that vertical
lines have no slope.

Suppose now that f has a derivative at x. This means that the difference quotient
approaches a certain limit f'(x) as h approaches 0. When this is interpreted geometrically
it tells us that, as h gets nearer to 0, the point P remains fixed, Q moves along the curve
toward P, and the line through PQ changes its direction in such a way that its slope
approaches the number f'(x) as a limit. For this reason it seems natural to define the slope
of the curve at P to be the number f'(x). The line through P having this slope is called the
tangent line at P.

Note: The concept of a line tangent to a circle (and to a few other special curves) was 
considered by the ancient Greeks. They defined a tangent line to a circle as a line having 
one of its points on the circle and all its other points outside the circle. From this defini- 
tion, many properties of tangent lines to circles can be derived. For example, we can prove 
that the tangent at any point is perpendicular to the radius at that point. However, the 
Greek definition of tangent line is not easily extended to more general curves. The method 
described above, where the tangent line is defined in terms of a derivative, has proved to 
be far more satisfactory. Using this definition, we can prove that for a circle the tangent 
line has all the properties ascribed to it by the Greek geometers. Concepts such as per- 
pendicularity and parallelism can be explained rather simply in analytic terms making use 
of slopes of lines. For example, from the trigonometric identity 


$$\tan(\alpha - \beta) = \frac{\tan \alpha - \tan \beta}{1 + \tan \alpha \tan \beta},$$

it follows that two nonvertical lines with the same slope are parallel. Also, from the
identity

$$\cot(\alpha - \beta) = \frac{1 + \tan \alpha \tan \beta}{\tan \alpha - \tan \beta},$$

we find that two nonvertical lines with slopes having product -1 are perpendicular.


The algebraic sign of the derivative of a function gives us useful information about the
behavior of its graph. For example, if x is a point in an open interval where the derivative
is positive, then the graph is rising in the immediate vicinity of x as we move from left to
right. This occurs at $x_3$ in Figure 4.6. A negative derivative in an interval means the
graph is falling, as shown at $x_1$, while a zero derivative at a point means a horizontal tangent
line. At a maximum or minimum, such as those shown at $x_2$, $x_5$, and $x_6$, the slope must be
zero. Fermat was the first to notice that points like $x_2$, $x_5$, and $x_6$, where f has a maximum
or minimum, must occur among the roots of the equation f'(x) = 0. It is important to
realize that f'(x) may also be zero at points where there is no maximum or minimum, such
as above the point $x_4$. Note that this particular tangent line crosses the graph. This is an
example of a situation not covered by the Greek definition of tangency.

[Figure 4.6. Surviving label from the figure: $f'(x_5) = 0$.]



The foregoing remarks concerning the significance of the algebraic sign of the derivative 
may seem quite obvious when we interpret them geometrically. Analytic proofs of these 
statements, based on general properties of derivatives, will be given in Section 4.16. 
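
A concrete instance of a zero derivative without a maximum or minimum can be tabulated in Python. The cubic f(x) = x^3 used below is our own choice of example and is not taken from the text; its derivative 3x^2 follows from Example 3 of Section 4.4.

```python
def f(x):
    return x ** 3


def f_prime(x):
    # Power rule from Example 3 of Section 4.4 with n = 3.
    return 3 * x ** 2


# f'(0) = 0, yet f is smaller to the left of 0 and larger to the right,
# so 0 is neither a maximum nor a minimum of f.
for x in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"x = {x:>4}: f(x) = {f(x):>7.3f}, f'(x) = {f_prime(x):.2f}")
```

The derivative is positive on both sides of 0, so the graph keeps rising and the horizontal tangent at the origin crosses it.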

4.8 Other notations for derivatives 

Notation has played an extremely important role in the development of mathematics.
Some mathematical symbols, such as $x^n$ or $n!$, are merely abbreviations that compress long
statements or formulas into a short space. Others, like the integration symbol $\int_a^b f(x)\,dx$,
not only remind us of the process being represented but also help us in carrying out
computations.

Sometimes several different notations are used for the same idea, preference for one
or another being dependent on the circumstances that surround the use of the symbols.
This is especially true in differential calculus where many different notations are used for
derivatives. The derivative of a function f has been denoted in our previous discussions
by f', a notation introduced by J. L. Lagrange (1736-1813) late in the 18th Century. This
emphasizes the fact that f' is a new function obtained from f by differentiation, its value
at x being denoted by f'(x). Each point (x, y) on the graph of f has its coordinates x and
y related by the equation y = f(x), and the symbol y' is also used to represent the derivative
f'(x). Similarly, $y'', \ldots, y^{(n)}$ represent the higher derivatives $f''(x), \ldots, f^{(n)}(x)$. For
example, if y = sin x, then y' = cos x, y'' = -sin x, etc. Lagrange’s notation is not too
far removed from that used by Newton who wrote $\dot{y}$ and $\ddot{y}$, instead of y' and y''. Newton’s
dots are still used by some authors, especially to denote velocity and acceleration.

Another symbol was introduced in 1800 by L. Arbogast (1759-1803) who denoted the
derivative of f by Df, a symbol that has widespread use today. The symbol D is called a
differentiation operator, and it helps to suggest that Df is a new function obtained from f
by the operation of differentiation. Higher derivatives $f'', f''', \ldots, f^{(n)}$ are written $D^2 f$,
$D^3 f, \ldots, D^n f$, respectively, the values of these derivatives at x being written $D^2 f(x)$,
$D^3 f(x), \ldots, D^n f(x)$. Thus, we have D sin x = cos x and $D^2 \sin x = D \cos x = -\sin x$.
The rule for differentiating a sum of two functions becomes, in the D-notation, D(f + g) =
Df + Dg. Evaluation of the derivatives at x leads to the formula [D(f + g)](x) =
Df(x) + Dg(x) which is also written in the form D[f(x) + g(x)] = Df(x) + Dg(x). The
reader may easily formulate the product and quotient rules in the D-notation.

Among the early pioneers of mathematical analysis, Leibniz, more than anyone else, 
understood the importance of well-chosen symbols. He experimented at great length and 
carried on extensive correspondence with other mathematicians, debating the merits or 
drawbacks of various notations. The tremendous impact that calculus has had on the 
development of modern mathematics is due in part to its well-developed and highly 
suggestive symbols, many of them originated by Leibniz. 

Leibniz developed a notation for derivatives quite different from those mentioned above. 
Using y for f(x), he wrote the difference quotient 

        [f(x + h) - f(x)] / h 

in the form 

        Δy / Δx, 

where Δx (read as “delta x”) was written for h, and Δy for f(x + h) - f(x). The symbol 
Δ is called a difference operator. For the limit of the difference quotient, that is, for the 
derivative f'(x), Leibniz wrote dy/dx. In this notation, the definition of derivative becomes 

        dy/dx = lim_{Δx→0} Δy/Δx. 

Not only was Leibniz’s notation different, but his way of thinking about derivatives was 
different. He thought of the limit dy/dx as a quotient of “infinitesimal” quantities dy and 
dx called “differentials,” and he referred to the derivative dy/dx as a “differential quotient.” 
Leibniz imagined infinitesimals as entirely new types of numbers which, although not zero, 
were smaller than every positive real number. 

Even though Leibniz was not able to give a satisfactory definition of infinitesimals, he 
and his followers used them freely in their development of calculus. Consequently, many 
people found calculus somewhat mysterious and began to question the validity of the 
methods. The work of Cauchy and others in the 19th Century gradually led to the replace- 
ment of infinitesimals by the classical theory of limits. Nevertheless, many people have 
found it helpful to try to think as Leibniz did in terms of infinitesimals. This kind of 
thinking has intuitive appeal and often leads quickly to results that can be proved correct 
by more conventional means. 

Recently Abraham Robinson has shown that the real number system can be extended 
to incorporate infinitesimals as envisaged by Leibniz. A discussion of this extension and its 
impact on many branches of mathematics is given in Robinson’s book, Non-standard 
Analysis, North-Holland Publishing Company, Amsterdam, 1966. 

Although some of Leibniz’s ideas fell into temporary disrepute, the same cannot be said 
of his notations. The symbol dy/dx for the derivative has the obvious advantage that it 
summarizes the whole process of forming the difference quotient and passing to the limit. 
Later we shall find the further advantage that certain formulas become easier to remember 
and to work with when derivatives are written in the Leibniz notation. 


4.9 Exercises 

1. Let f(x) = (1/3)x^3 - 2x^2 + 3x + 1 for all x. Find the points on the graph of f at which the 
tangent line is horizontal. 

2. Let f(x) = (2/3)x^3 + (1/2)x^2 - x - 1 for all x. Find the points on the graph of f at which the slope 
is: (a) 0; (b) -1; (c) 5. 

3. Let f(x) = x + sin x for all x. Find all points x for which the graph of f at (x, f(x)) has slope 
zero. 

4. Let f(x) = x^2 + ax + b for all x. Find values of a and b such that the line y = 2x is tangent 
to the graph of f at the point (2, 4). 

5. Find values of the constants a, b, and c for which the graphs of the two polynomials f(x) = 
x^2 + ax + b and g(x) = x^3 - c will intersect at the point (1, 2) and have the same tangent 
line at that point. 

6. Consider the graph of the function f defined by the equation f(x) = x^2 + ax + b, where a 
and b are constants. 

(a) Find the slope of the chord joining the points on the graph for which x = x_1 and x = x_2. 

(b) Find, in terms of x_1 and x_2, all values of x for which the tangent line at (x, f(x)) has the 
same slope as the chord in part (a). 

7. Show that the line y = -x is tangent to the curve given by the equation y = x^3 - 6x^2 + 8x. 
Find the point of tangency. Does this tangent line intersect the curve anywhere else? 

8. Make a sketch of the graph of the cubic polynomial f(x) = x - x^3 over the closed interval 
-2 ≤ x ≤ 2. Find constants m and b such that the line y = mx + b will be tangent to the 
graph of f at the point (-1, 0). A second line through (-1, 0) is also tangent to the graph of f 
at a point (a, c). Determine the coordinates a and c. 

9. A function f is defined as follows: 

        f(x) = x^2        if x ≤ c, 
        f(x) = ax + b     if x > c        (a, b, c constants). 

Find values of a and b (in terms of c) such that f'(c) exists. 

10. Solve Exercise 9 when f is defined as follows: 

        f(x) = 1/|x|      if |x| > c, 
        f(x) = a + bx^2   if |x| ≤ c. 

11. Solve Exercise 9 when f is defined as follows: 

        f(x) = sin x      if x ≤ c, 
        f(x) = ax + b     if x > c. 

12. If f(x) = (1 - x)/(1 + √x) for x > 0, find formulas for Df(x), D^2 f(x), and D^3 f(x). 

13. There is a polynomial P(x) = ax^3 + bx^2 + cx + d such that P(0) = P(1) = -2, P'(0) = -1, 
and P''(0) = 10. Compute a, b, c, d. 

14. Two functions f and g have first and second derivatives at 0 and satisfy the relations 

        f(0) = 2/g(0),    f'(0) = 2g'(0) = 4g(0),    g''(0) = 5f''(0) = 6f(0) = 5. 

(a) Let h(x) = f(x)/g(x), and compute h'(0). 

(b) Let k(x) = f(x)g(x) sin x, and compute k'(0). 

(c) Compute the limit of g'(x)/f'(x) as x → 0. 

15. Given that the derivative f'(a) exists. State which of the following statements are true and 
which are false. Give a reason for your decision in each case. 

(a) f'(a) = lim_{h→a} [f(h) - f(a)] / (h - a). 

(b) f'(a) = lim_{h→0} [f(a) - f(a - h)] / h. 

(c) f'(a) = lim_{t→0} [f(a + 2t) - f(a)] / t. 

(d) f'(a) = lim_{t→0} [f(a + 2t) - f(a + t)] / (2t). 


16. Suppose that instead of the usual definition of the derivative Df(x), we define a new kind of 
derivative, D*f(x), by the formula 

        D*f(x) = lim_{h→0} [f^2(x + h) - f^2(x)] / h, 

where f^2(x) means [f(x)]^2. 

(a) Derive formulas for computing the derivative D* of a sum, difference, product, and 
quotient. 

(b) Express D*f(x) in terms of Df(x). 

(c) For what functions does D*f = Df? 


4.10 The chain rule for differentiating composite functions 

With the differentiation formulas developed thus far, we can find derivatives of functions 
f for which f(x) is a finite sum of products or quotients of constant multiples of sin x, 
cos x, and x^r (r rational). As yet, however, we have not learned to deal with something 
like f(x) = sin(x^2) without going back to the definition of derivative. In this section we 
shall present a theorem, called the chain rule, that enables us to differentiate composite 
functions such as f(x) = sin(x^2). This increases substantially the number of functions 
that we can differentiate. 

We recall that if u and v are functions such that the domain of u includes the range of v, 
we can define the composite function f = u ∘ v by the equation 

        f(x) = u[v(x)]. 

The chain rule tells us how to express the derivative of f in terms of the derivatives u' and v'. 

THEOREM 4.2. CHAIN RULE. Let f be the composition of two functions u and v, say 
f = u ∘ v. Suppose that both derivatives v'(x) and u'(y) exist, where y = v(x). Then the 
derivative f'(x) also exists and is given by the formula 

(4.11)        f'(x) = u'(y) · v'(x). 

In other words, to compute the derivative of u ∘ v at x, we first compute the derivative of 
u at the point y, where y = v(x), and multiply this by v'(x). 

Before we discuss the proof of (4.11), we shall mention some alternative ways of expressing 
the chain rule formula. If we write (4.11) entirely in terms of x, we obtain the formula 

        f'(x) = u'[v(x)] · v'(x). 

Expressed as an equation involving functions rather than numbers, the chain rule assumes 
the following form: 

        (u ∘ v)' = (u' ∘ v) · v'. 

In the u(v)-notation, let us write u(v)' for the derivative of the composite function u(v) and 
u'(v) for the composition u' ∘ v. Then the last formula becomes 

        u(v)' = u'(v) · v'. 


Proof of Theorem 4.2. We turn now to the proof of (4.11). We assume that v has a 
derivative at x and that u has a derivative at v(x), and we wish to prove that f has a derivative 
at x given by the product u'[v(x)] · v'(x). The difference quotient for f is 

(4.12)        [f(x + h) - f(x)] / h = {u[v(x + h)] - u[v(x)]} / h. 

It is helpful at this stage to introduce some new notation. Let y = v(x) and let k = 
v(x + h) - v(x). (It is important to realize that k depends on h.) Then we have 
v(x + h) = y + k and (4.12) becomes 

(4.13)        [f(x + h) - f(x)] / h = [u(y + k) - u(y)] / h. 

The right-hand side of (4.13) resembles the difference quotient whose limit defines u'(y) 
except that h appears in the denominator instead of k. If k ≠ 0, it is easy to complete the 
proof. We simply multiply numerator and denominator by k, and the right-hand side of 
(4.13) becomes 

(4.14)        [u(y + k) - u(y)] / k · k/h = [u(y + k) - u(y)] / k · [v(x + h) - v(x)] / h. 

When h → 0, the last quotient on the right tends to v'(x). Also, k → 0 as h → 0 because 
k = v(x + h) - v(x) and v is continuous at x. Therefore the first quotient on the right 
of (4.14) approaches u'(y) as h → 0, and this leads at once to (4.11). 

Although the foregoing argument seems to be the most natural way to proceed, it is not 
completely general. Since k = v(x + h) - v(x), it may happen that k = 0 for infinitely 
many values of h as h → 0, in which case the passage from (4.13) to (4.14) is not valid. 
To overcome this difficulty, a slight modification of the proof is needed. 

Let us return to Equation (4.13) and express the quotient on the right in a form that 
does not involve k in the denominator. For this purpose we introduce the difference 
between the derivative u'(y) and the difference quotient whose limit is u'(y). That is, we 
define a new function g as follows: 

(4.15)        g(t) = [u(y + t) - u(y)] / t - u'(y)    if t ≠ 0. 

This equation defines g(t) only if t ≠ 0. Multiplying by t and rearranging terms, we may 
write (4.15) in the following form: 

(4.16)        u(y + t) - u(y) = t[g(t) + u'(y)]. 

Although (4.16) has been derived under the hypothesis that t ≠ 0, it also holds for t = 0, 
provided we assign some definite value to g(0). Since g(t) → 0 as t → 0, we shall define g(0) 
to be 0. This will ensure the continuity of g at 0. If, now, we replace t in (4.16) by k, where 
k = v(x + h) - v(x), and substitute the right-hand side of (4.16) in (4.13), we obtain 

(4.17)        [f(x + h) - f(x)] / h = (k/h)[g(k) + u'(y)], 

a formula that is valid even if k = 0. When h → 0 the quotient k/h → v'(x) and g(k) → 0 
so the right-hand side of (4.17) approaches the limit u'(y) · v'(x). This completes the proof 
of the chain rule. 
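
The role of the auxiliary function g can be seen in a small numerical experiment. The sketch below is an illustration only, under assumptions of our own choosing: the inner function v(x) = x^2 for x > 0 and v(x) = 0 for x ≤ 0 is differentiable at 0 with v'(0) = 0, yet k = v(0 + h) - v(0) is exactly 0 for every negative h, so the passage to (4.14) would require division by zero there. The outer function is u = sin, and the name both_sides is hypothetical.

```python
import math

def v(x):
    # Differentiable at 0 with v'(0) = 0, yet v(0 + h) - v(0) = 0 for every h < 0.
    return x * x if x > 0 else 0.0

u = math.sin     # outer function
du = math.cos    # its derivative u'

def g(t, y):
    """The function g of (4.15), extended by g(0) = 0."""
    if t == 0:
        return 0.0
    return (u(y + t) - u(y)) / t - du(y)

def both_sides(x, h):
    """Left and right members of (4.17) for f = u o v."""
    y = v(x)
    k = v(x + h) - v(x)
    left = (u(v(x + h)) - u(v(x))) / h
    right = (k / h) * (g(k, y) + du(y))
    return left, right

# For h < 0 we have k = 0, and (4.17) still holds with both members equal to 0.
# For h > 0 both members are close to h, which tends to u'(0) * v'(0) = 0.
for h in (-0.1, -0.001, 0.1, 0.001):
    print(h, both_sides(0.0, h))
```

The point of the experiment is that the right member of (4.17) never divides by k, so no special treatment of the case k = 0 is needed.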


4.11 Applications of the chain rule. Related rates and implicit differentiation 

The chain rule is an excellent example to illustrate the usefulness of the Leibniz notation 
for derivatives. In fact, if we write (4.11) in the Leibniz notation, it assumes the appearance 
of a trivial algebraic identity. First we introduce new symbols, say 

        y = v(x)    and    z = u(y). 

Then we write dy/dx for the derivative v'(x), and dz/dy for u'(y). The formation of the 
composite function is indicated by writing 

        z = u(y) = u[v(x)] = f(x), 

and dz/dx is written for the derivative f'(x). The chain rule, as expressed in Equation 
(4.11), now becomes 

(4.18)        dz/dx = (dz/dy)(dy/dx). 

The strong suggestive power of this formula is obvious. It is especially attractive to people 
who use calculus in physical problems. For example, suppose the foregoing symbol z 
represents a physical quantity measured in terms of other physical quantities x and y. 
The equation z = f(x) tells us how to find z if x is given, and the equation z = u(y) tells 
us how to find z if y is given. The relation between x and y is expressed by the equation 
y = v(x). The chain rule, as expressed in (4.18), tells us that the rate of change of z with 
respect to x is equal to the product of the rate of change of z with respect to y and the rate 
of change of y with respect to x. The following example illustrates how the chain rule may 
be used in a special physical problem. 


EXAMPLE 1. Suppose a gas is pumped into a spherical balloon at a constant rate of 50 
cubic centimeters per second. Assume that the gas pressure remains constant and that the 
balloon always has a spherical shape. How fast is the radius of the balloon increasing 
when the radius is 5 centimeters? 

Solution. Let r denote the radius and V the volume of the balloon at time t. We are 
given dV/dt, the rate of change of volume with respect to time, and we want to determine 
dr/dt, the rate of change of the radius with respect to time, at the instant when r = 5. The 
chain rule provides the connection between the given data and the unknown. It states that 

(4.19)        dV/dt = (dV/dr)(dr/dt). 

To compute dV/dr, we use the formula V = 4πr^3/3 which expresses the volume of the sphere 
in terms of its radius. Differentiation gives us dV/dr = 4πr^2, and hence (4.19) becomes 

        dV/dt = 4πr^2 (dr/dt). 

Substituting dV/dt = 50 and r = 5, we obtain dr/dt = 1/(2π). That is to say, the radius is 
increasing at a rate of 1/(2π) centimeters per second at the instant when r = 5. 

The foregoing example is called a problem in related rates. Note that it was not necessary 
to express r as a function of t in order to determine the derivative dr/dt. It is this fact that 
makes the chain rule especially useful in related-rate problems. 
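
The result of Example 1 can be checked numerically. The Python sketch below is a rough check rather than part of the solution: the function names are hypothetical, and radius_at makes the extra assumption (not needed in the example itself) that the balloon is empty at time t = 0, so that r can be written explicitly as a function of t and differentiated by a difference quotient.

```python
import math

RATE = 50.0  # dV/dt in cubic centimeters per second (given in Example 1)

def dr_dt(r):
    """Rate of change of the radius, from dV/dt = 4*pi*r^2 * dr/dt."""
    return RATE / (4 * math.pi * r ** 2)

def radius_at(t):
    """Radius at time t, assuming (for this check only) that V = 0 at t = 0."""
    volume = RATE * t
    return (3 * volume / (4 * math.pi)) ** (1 / 3)

# Time at which the radius equals 5 under the assumption above.
t5 = (4 * math.pi * 5 ** 3 / 3) / RATE
h = 1e-6
numeric = (radius_at(t5 + h) - radius_at(t5 - h)) / (2 * h)

# The three printed numbers should nearly coincide.
print(dr_dt(5), 1 / (2 * math.pi), numeric)
```

The chain-rule computation needs only dV/dt and dV/dr, whereas the numerical check needs the explicit formula for r in terms of t; this is the economy noted in the remark on related rates.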

The next two examples show how the chain rule may be used to obtain new differentiation 
formulas. 


EXAMPLE 2. Given f(x) = sin(x^2), compute f'(x). 

Solution. The function f is a composition, f(x) = u[v(x)], where v(x) = x^2 and u(x) = 
sin x. To use the chain rule, we need to determine u'[v(x)] = u'(x^2). Since u'(x) = cos x, 
we have u'(x^2) = cos(x^2), and hence (4.11) gives us 

        f'(x) = cos(x^2) · v'(x) = cos(x^2) · 2x. 

We may also solve the problem using the Leibniz notation. If we write y = x^2 and z = f(x), 
then z = sin y and dz/dx = f'(x). The chain rule yields 

        dz/dx = (dz/dy)(dy/dx) = (cos y)(2x) = cos(x^2) · 2x, 

which agrees with the foregoing result for f'(x). 
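
As a sanity check on Example 2, the formula f'(x) = cos(x^2) · 2x can be compared with a difference quotient. The sketch below is illustrative only; central_difference is a hypothetical helper that approximates the derivative with a small step h.

```python
import math

def f(x):
    return math.sin(x * x)

def f_prime(x):
    """Derivative from the chain rule: cos(x^2) * 2x."""
    return math.cos(x * x) * 2 * x

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each row prints x, the chain-rule value, and the numerical approximation.
for x in (-1.5, 0.0, 0.5, 2.0):
    print(x, f_prime(x), central_difference(f, x))
```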

EXAMPLE 3. If f(x) = [v(x)]^n, where n is a positive integer, compute f'(x) in terms of 
v(x) and v'(x). 

Solution. The function f is a composition, f(x) = u[v(x)], where u(x) = x^n. Since 
u'(x) = nx^(n-1), we have u'[v(x)] = n[v(x)]^(n-1), and the chain rule yields 

        f'(x) = n[v(x)]^(n-1) v'(x). 

If we omit the reference to x and write this as an equality involving functions, we obtain 
the important formula 

        (v^n)' = n v^(n-1) v', 

which tells us how to differentiate the nth power of v when v' exists. The formula is also 
valid for rational powers if v^n and v^(n-1) are defined. To solve the problem in the Leibniz 
notation, we write y = v(x) and z = f(x). Then z = y^n, dz/dx = f'(x), and the chain rule 
gives us 

        dz/dx = (dz/dy)(dy/dx) = n y^(n-1) v'(x), 

which agrees with the first solution. 

EXAMPLE 4. The equation x^2 + y^2 = r^2 represents a circle of radius r and center at the 
origin. If we solve this equation for y in terms of x, we obtain two solutions which serve 
to define two functions f and g given on the interval [-r, r] by the formulas 

        f(x) = √(r^2 - x^2)    and    g(x) = -√(r^2 - x^2). 

(The graph of f is the upper semicircle and the graph of g the lower semicircle.) We may 
compute the derivatives of f and g by the chain rule. For f we use the result of Example 3 
with v(x) = r^2 - x^2 and n = 1/2 to obtain 

(4.20)        f'(x) = (1/2)(r^2 - x^2)^(-1/2) (-2x) = -x/√(r^2 - x^2) = -x/f(x) 

whenever f(x) ≠ 0. The same method, applied to g, gives us 

(4.21)        g'(x) = -(1/2)(r^2 - x^2)^(-1/2) (-2x) = x/√(r^2 - x^2) = -x/g(x) 

whenever g(x) ≠ 0. Notice that if we let y stand for either f(x) or g(x), then both formulas 
(4.20) and (4.21) can be combined into one, namely, 

(4.22)        y' = -x/y    if y ≠ 0. 

Another useful application of the chain rule has to do with a technique known as implicit 
differentiation. We shall explain the method and illustrate its advantages by rederiving 
the result of Example 4 in a simpler way. 

EXAMPLE 5. Implicit differentiation. Formula (4.22) may be derived directly from the 
equation x^2 + y^2 = r^2 without the necessity of solving for y. We remember that y is a 
function of x [either y = f(x) or y = g(x)]. Assuming that y' exists, we differentiate both 
sides of the equation x^2 + y^2 = r^2 to obtain 

(4.23)        2x + 2yy' = 0. 

(The term 2yy' comes from differentiating y^2 as explained in Example 3.) When Equation 
(4.23) is solved for y' it yields (4.22). 

The equation x^2 + y^2 = r^2 is said to define y implicitly as a function of x (it actually 
defines two functions), and the process by which (4.23) is obtained from this equation is 
called implicit differentiation. The end result is valid for either of the two functions f and g 
so defined. Notice that at a point (x, y) on the circle with x ≠ 0 and y ≠ 0, the tangent 
line has a slope -x/y, whereas the radius from the center to (x, y) has the slope y/x. The 
product of the two slopes is -1 so the tangent is perpendicular to the radius. 
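
The formula y' = -x/y and the perpendicularity remark can both be tested numerically on a particular circle. The following sketch is an illustration under assumed values: the radius R = 5 and the sample points are arbitrary choices, and the function names are hypothetical.

```python
import math

R = 5.0  # radius of the circle, chosen arbitrarily for this check

def upper(x):
    """f(x) = sqrt(r^2 - x^2), the upper semicircle."""
    return math.sqrt(R * R - x * x)

def lower(x):
    """g(x) = -sqrt(r^2 - x^2), the lower semicircle."""
    return -math.sqrt(R * R - x * x)

def slope_implicit(x, y):
    """y' = -x / y, from 2x + 2yy' = 0 (requires y != 0)."""
    return -x / y

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

for branch in (upper, lower):
    for x in (-4.0, -1.0, 3.0):
        y = branch(x)
        tangent = slope_implicit(x, y)
        radius = y / x
        # The tangent slope should match the numerical derivative of the branch,
        # and the product of tangent and radius slopes should be close to -1.
        print(branch.__name__, x, tangent, central_difference(branch, x), tangent * radius)
```

A single formula, -x/y, serves both semicircles, which is the advantage of implicit differentiation noted in Example 5.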


4.12 Exercises 

In Exercises 1 through 14, determine the derivative f'(x). In each case it is understood that x is 
restricted to those values for which the formula for f(x) is meaningful. 

1. f(x) = cos 2x - 2 sin x. 

2. f(x) = √(1 + x^2). 

3. f(x) = (2 - x^2) cos x^2 + 2x sin x^3. 

4. f(x) = sin(cos^2 x) · cos(sin^2 x). 

5. f(x) = sin^n x · cos nx. 

6. f(x) = sin[sin(sin x)]. 

7. f(x) = (sin^2 x) / (sin x^2). 

8. f(x) = tan(x/2) - cot(x/2). 

9. f(x) = sec^2 x + csc^2 x. 

10. f(x) = x√(1 + x^2). 

11. f(x) = x / √(4 - x^2). 

12. f(x) = [(1 + x^3) / (1 - x^3)]^(1/3). 

13. f(x) = 1 / [√(1 + x^2) (x + √(1 + x^2))]. 

14. f(x) = √(x + √(x + √x)). 

15. Compute f'(x) if f(x) = (1 + x)(2 + x^2)^(1/2) (3 + x^3)^(1/3), x^3 ≠ -3. 

16. Let f(x) = 1/(1 + 1/x) if x ≠ 0, and let g(x) = 1/(1 + 1/f(x)). Compute f'(x) and g'(x). 

17. The following table of values was computed for a pair of functions f and g and their deriva- 
tives f' and g'. Construct a corresponding table for the two composite functions h and k 
given by h(x) = f[g(x)], k(x) = g[f(x)]. 

        x     f(x)    f'(x)    g(x)    g'(x) 
        0      1       5       2      -5 
        1      3      -2       0       1 
        2      0       1       3       1 
        3      2       4       1      -6 

18. A function f and its first two derivatives are tabulated as shown. Let g(x) = x f(x^2) and make 
a table of g and its first two derivatives for x = 0, 1, 2. 



19. Determine the derivative g'(x) in terms of f'(x) if: 

(a) g(x) = f(x^2); 
(b) g(x) = f(sin^2 x) + f(cos^2 x); 
(c) g(x) = f[f(x)]; 
(d) g(x) = f{f[f(x)]}. 

Related rates and implicit differentiation. 

20. Each edge of a cube is expanding at the rate of 1 centimeter (cm) per second. How fast is the 
volume changing when the length of each edge is (a) 5 cm? (b) 10 cm? (c) x cm? 

21. An airplane flies in level flight at constant velocity, eight miles above the ground. (In this 
exercise assume the earth is flat.) The flight path passes directly over a point P on the ground. 
The distance from the plane to P is decreasing at the rate of 4 miles per minute at the instant 
when this distance is 10 miles. Compute the velocity of the plane in miles per hour. 

22. A baseball diamond is a 90-foot square. A ball is batted along the third-base line at a constant 
speed of 100 feet per second. How fast is its distance from first base changing when (a) it is 
halfway to third base? (b) it reaches third base? 

23. A boat sails parallel to a straight beach at a constant speed of 12 miles per hour, staying 4 
miles offshore. How fast is it approaching a lighthouse on the shoreline at the instant it is 
exactly 5 miles from the lighthouse? 

24. A reservoir has the shape of a right-circular cone. The altitude is 10 feet, and the radius of the 
base is 4 ft. Water is poured into the reservoir at a constant rate of 5 cubic feet per minute. 
How fast is the water level rising when the depth of the water is 5 feet if (a) the vertex of the 
cone is up? (b) the vertex of the cone is down? 

25. A water tank has the shape of a right-circular cone with its vertex down. Its altitude is 10 feet 
and the radius of the base is 15 feet. Water leaks out of the bottom at a constant rate of 1 
cubic foot per second. Water is poured into the tank at a constant rate of c cubic feet per 
second. Compute c so that the water level will be rising at the rate of 4 feet per second at the 
instant when the water is 2 feet deep. 

26. Water flows into a hemispherical tank of radius 10 feet (flat side up). At any instant, let h 
denote the depth of the water, measured from the bottom, r the radius of the surface of the 
water, and V the volume of the water in the tank. Compute dV/dh at the instant when h = 5 
feet. If the water flows in at a constant rate of 5√3 cubic feet per second, compute dr/dt, 
the rate at which r is changing, at the instant t when h = 5 feet. 

27. A variable right triangle ABC in the xy-plane has its right angle at vertex B, a fixed vertex 
A at the origin, and the third vertex C restricted to lie on the parabola y = 1 + x^2. The 
point B starts at the point (0, 1) at time t = 0 and moves upward along the y-axis at a constant 
velocity of 2 cm/sec. How fast is the area of the triangle increasing when t = 7/2 sec? 

28. The radius of a right-circular cylinder increases at a constant rate. Its altitude is a linear 
function of the radius and increases three times as fast as the radius. When the radius is 1 
foot the altitude is 6 feet. When the radius is 6 feet, the volume is increasing at a rate of 1 
cubic foot per second. When the radius is 36 feet, the volume is increasing at a rate of n cubic 
feet per second, where n is an integer. Compute n. 

29. A particle is constrained to move along a parabola whose equation is y = x^2. (a) At what 
point on the curve are the abscissa and the ordinate changing at the same rate? (b) Find this 
rate if the motion is such that at time t we have x = sin t and y = sin^2 t. 

30. The equation x^3 + y^3 = 1 defines y as one or more functions of x. (a) Assuming the derivative 
y' exists, and without attempting to solve for y, show that y' satisfies the equation x^2 + y^2 y' = 0. 
(b) Assuming the second derivative y'' exists, show that y'' = -2xy^(-5) whenever y ≠ 0. 

31. If 0 < x < 5, the equation x^(1/2) + y^(1/2) = 5 defines y as a function of x. Without solving for y, 
show that the derivative y' has a fixed sign. (You may assume the existence of y'.) 

32. The equation 3x^2 + 4y^2 = 12 defines y implicitly as two functions of x if |x| < 2. Assuming 
the second derivative y'' exists, show that it satisfies the equation 4y^3 y'' = -9. 

33. The equation x sin xy + 2x^2 = 0 defines y implicitly as a function of x. Assuming the deriva- 
tive y' exists, show that it satisfies the equation y'x^2 cos xy + xy cos xy + sin xy + 4x = 0. 

34. If y = x^r, where r is a rational number, say r = m/n, then y^n = x^m. Assuming the existence 
of the derivative y', derive the formula y' = rx^(r-1) using implicit differentiation and the corre- 
sponding formula for integer exponents. 


4.13 Applications of differentiation to extreme values of functions 

Differentiation can be used to help locate maxima and minima of functions. Actually, 
there are two different uses of the word “maximum” in calculus, and they are distinguished 
by the two prefixes absolute and relative. The concept of absolute maximum was introduced 
in Chapter 3. We recall that a real-valued function f is said to have an absolute maximum 
on a set S if there is at least one point c in S such that 

        f(x) ≤ f(c)    for all x in S. 

The concept of relative maximum is defined as follows. 

DEFINITION OF RELATIVE MAXIMUM. A function f, defined on a set S, is said to have a 
relative maximum at a point c in S if there is some open interval I containing c such that 

        f(x) ≤ f(c)    for all x which lie in I ∩ S. 


The concept of relative minimum is similarly defined by reversing the inequality. 


In other words, a relative maximum at c is an absolute maximum in some neighborhood 
of c, although this need not be an absolute maximum on the whole of S. Examples are 
shown in Figure 4.7. Of course, every absolute maximum is, in particular, a relative 
maximum. 

Figure 4.7 Extrema of functions. 

DEFINITION OF EXTREMUM. A number which is either a relative maximum or a relative 
minimum of a function f is called an extreme value or an extremum of f. 

The next theorem, which is illustrated in Figure 4.7, relates extrema of a function to 
horizontal tangents of its graph. 

THEOREM 4.3. VANISHING OF THE DERIVATIVE AT AN INTERIOR EXTREMUM. Let f be 
defined on an open interval I, and assume that f has a relative maximum or a relative minimum 
at an interior point c of I. If the derivative f'(c) exists, then f'(c) = 0. 

Proof. Define a function Q on I as follows: 

        Q(x) = [f(x) - f(c)] / (x - c)    if x ≠ c,        Q(c) = f'(c). 

Since f'(c) exists, Q(x) → Q(c) as x → c, so Q is continuous at c. We wish to prove that 
Q(c) = 0. We shall do this by showing that each of the inequalities Q(c) > 0 and Q(c) < 0 
leads to a contradiction. 

Assume Q(c) > 0. By the sign-preserving property of continuous functions, there is an 
interval about c in which Q(x) is positive. Therefore the numerator of the quotient Q(x) 
has the same sign as the denominator for all x ≠ c in this interval. In other words, 
f(x) > f(c) when x > c, and f(x) < f(c) when x < c. This contradicts the assumption 
that f has an extremum at c. Hence, the inequality Q(c) > 0 is impossible. A similar 
argument shows that we cannot have Q(c) < 0. Therefore Q(c) = 0, as asserted. Since 
Q(c) = f'(c), this proves the theorem. 


It is important to realize that a zero derivative at c does not imply an extremum at c. 
For example, let f(x) = x^3. The graph of f is shown in Figure 4.8. Here f'(x) = 3x^2, so 
f'(0) = 0. However, this function is increasing in every interval containing 0 so there is 
no extremum at 0. This example shows that a zero derivative at c is not sufficient for an 
extremum at c. 

Figure 4.8 Here f'(0) equals 0 but there is no extremum at 0. 

Figure 4.9 There is an extremum at 0, but f'(0) does not exist. 

Another example, f(x) = |x|, shows that a zero derivative does not always occur at an 
extremum. Here there is a relative minimum at 0, as shown in Figure 4.9, but at the point 
0 itself the graph has a sharp corner and there is no derivative. Theorem 4.3 assumes that 
the derivative f'(c) exists at the extremum. In other words, Theorem 4.3 tells us that, in 
the absence of sharp corners, the derivative must necessarily vanish at an extremum if this 
extremum occurs in the interior of an interval. 
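
The two cautionary examples, x^3 and |x|, can be examined with one-sided difference quotients. The sketch below is only an illustration; one_sided_quotients is a hypothetical helper and the step h = 1e-4 is an arbitrary choice.

```python
def cube(x):
    return x ** 3

def one_sided_quotients(func, x, h):
    """Difference quotients taken from the left and from the right of x."""
    left = (func(x) - func(x - h)) / h
    right = (func(x + h) - func(x)) / h
    return left, right

h = 1e-4

# x^3: both one-sided quotients are tiny (they tend to 0 with h), yet values to the
# left of 0 are smaller than cube(0) and values to the right are larger: no extremum.
print(one_sided_quotients(cube, 0.0, h), cube(-h) < cube(0.0) < cube(h))

# |x|: a relative minimum at 0, but the one-sided quotients are -1 and +1,
# so they have no common limit and the derivative at 0 does not exist.
print(one_sided_quotients(abs, 0.0, h), abs(-h) > abs(0.0) < abs(h))
```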

In a later section we shall describe a test for extrema which is comprehensive enough to 
include both the examples in Figure 4.7 and also the example in Figure 4.9. This test, 
which is described in Theorem 4.8, tells us that an extremum always occurs at a point 
where the derivative changes its sign. Although this fact may seem geometrically evident, 
a proof is not easy to give with the materials developed thus far. We shall deduce this 
result as a consequence of the mean-value theorem for derivatives which we discuss next. 


4.14 The mean-value theorem for derivatives 

The mean-value theorem for derivatives holds a position of importance in calculus 
because many properties of functions can easily be deduced from it. Before we state the 
mean-value theorem, we will examine one of its special cases from which the more general 
theorem will be deduced. This special case was discovered in 1690 by Michel Rolle 
(1652-1719), a French mathematician. 


THEOREM 4.4. ROLLE’S THEOREM. Let f be a function which is continuous everywhere 
on a closed interval [a, b] and has a derivative at each point of the open interval (a, b). Also, 
assume that 

        f(a) = f(b). 

Then there is at least one point c in the open interval (a, b) such that f'(c) = 0. 


The geometric significance of Rolle’s theorem is illustrated in Figure 4.10. The theorem 
simply asserts that the curve shown must have a horizontal tangent somewhere between 
a and b. 




Figure 4.10 Geometric interpretation of Rolle’s theorem. 

Figure 4.11 Geometric significance of the mean-value theorem. 


Proof. We assume that f'(x) ≠ 0 for every x in the open interval (a, b), and we arrive 
at a contradiction as follows: By the extreme-value theorem for continuous functions, f 
must take on its absolute maximum M and its absolute minimum m somewhere in the 
closed interval [a, b]. Theorem 4.3 tells us that neither extreme value can be taken at any 
interior point (otherwise the derivative would vanish there). Hence, both extreme values 
are taken on at the endpoints a and b. But since f(a) = f(b), this means that m = M, and 
hence f is constant on [a, b]. This contradicts the fact that f'(x) ≠ 0 for all x in (a, b). It 
follows that f'(c) = 0 for at least one c satisfying a < c < b, which proves the theorem. 

We can use Rolle’s theorem to prove the mean-value theorem. Before we state the 
mean-value theorem, it may be helpful to examine its geometric significance. Each of the 
curves shown in Figure 4.11 is the graph of a continuous function f with a tangent line 
above each point of the open interval (a, b). At the point (c, f(c)) shown in Figure 4.11(a), 
the tangent line is parallel to the chord AB. In Figure 4.11(b), there are two points where 
the tangent line is parallel to the chord AB. The mean-value theorem guarantees that 
there will be at least one point with this property. 

To translate this geometric property into an analytic statement, we need only observe 
that parallelism of two lines means equality of their slopes. Since the slope of the chord 
AB is the quotient [f(b) - f(a)]/(b - a) and since the slope of the tangent line at c is the 
derivative f'(c), the above assertion states that 

(4.24)        [f(b) - f(a)] / (b - a) = f'(c) 

for some c in the open interval (a, b). 

To exhibit strong intuitive evidence for the truth of (4.24) we may think of f(t) as the 
distance traveled by a moving particle at time t. Then the quotient on the left of (4.24) 
represents the mean or average speed in the time interval [a, b], and the derivative f'(t) 
represents the instantaneous speed at time t. The equation asserts that there must be 
some moment when the instantaneous speed is equal to the average speed. For example, 
if the average speed during an automobile trip is 45 mph, then the speedometer must 
register 45 mph at least once during the trip. 

The mean-value theorem may be stated formally as follows. 


THEOREM 4.5. MEAN-VALUE THEOREM FOR DERIVATIVES. Assume that f is continuous 
everywhere on a closed interval [a, b] and has a derivative at each point of the open interval 
(a, b). Then there is at least one interior point c of (a, b) for which 

(4.25)        f(b) - f(a) = f'(c)(b - a). 

Proof. To apply Rolle’s theorem we need a function which has equal values at the 
endpoints a and b. To construct such a function, we modify f as follows. Let 

        h(x) = f(x)(b - a) - x[f(b) - f(a)]. 

Then h(a) = h(b) = bf(a) - af(b). Also, h is continuous on [a, b] and has a derivative 
in the open interval (a, b). Applying Rolle’s theorem to h, we find that h'(c) = 0 for some 
c in (a, b). But 

        h'(x) = f'(x)(b - a) - [f(b) - f(a)]. 

When x = c, this gives us Equation (4.25). 

Notice that the theorem makes no assertion about the exact location of the one or more 
“mean values” c, except to say that they all lie somewhere between a and b. For some 
functions the position of the mean values may be specified exactly, but in most cases it is 
very difficult to make an accurate determination of these points. Nevertheless, the real 
usefulness of the theorem lies in the fact that many conclusions can be drawn from the 
knowledge of the mere existence of at least one mean value. 

Note: It is important to realize that the conclusion of the mean-value theorem may fail
to hold if there is any point between a and b where the derivative does not exist. For
example, the function f defined by the equation f(x) = |x| is continuous everywhere on the
real axis and has a derivative everywhere except at 0. Let A = (-1, f(-1)) and let B =
(2, f(2)). The slope of the chord joining A and B is

$$\frac{f(2) - f(-1)}{2 - (-1)} = \frac{2 - 1}{3} = \frac{1}{3},$$

but the derivative is nowhere equal to $\frac{1}{3}$.

The following extension of the mean-value theorem is often useful.

THEOREM 4.6. CAUCHY'S MEAN-VALUE FORMULA. Let f and g be two functions continuous
on a closed interval [a, b] and having derivatives in the open interval (a, b). Then, for
some c in (a, b), we have

$$f'(c)[g(b) - g(a)] = g'(c)[f(b) - f(a)].$$

Proof. The proof is similar to that of Theorem 4.5. We let

$$h(x) = f(x)[g(b) - g(a)] - g(x)[f(b) - f(a)].$$

Then $h(a) = h(b) = f(a)g(b) - g(a)f(b)$. Applying Rolle's theorem to h, we find that
$h'(c) = 0$ for some c in (a, b). Computing $h'(c)$ from the formula defining h, we obtain
Cauchy's mean-value formula. Theorem 4.5 is the special case obtained by taking g(x) = x.


4.15 Exercises 

1. Show that on the graph of any quadratic polynomial the chord joining the points for which
x = a and x = b is parallel to the tangent line at the midpoint x = (a + b)/2.

2. Use Rolle's theorem to prove that, regardless of the value of b, there is at most one point x
in the interval $-1 < x < 1$ for which $x^3 - 3x + b = 0$.

3. Define a function f as follows:

$$f(x) = \frac{3 - x^2}{2} \quad \text{if } x \le 1, \qquad f(x) = \frac{1}{x} \quad \text{if } x \ge 1.$$

(a) Sketch the graph of f for x in the interval $0 \le x \le 2$.

(b) Show that f satisfies the conditions of the mean-value theorem over the interval [0, 2]
and determine all the mean values provided by the theorem.

4. Let $f(x) = 1 - x^{2/3}$. Show that f(1) = f(-1) = 0, but that $f'(x)$ is never zero in the interval
[-1, 1]. Explain how this is possible, in view of Rolle's theorem.

5. Show that $x^2 = x \sin x + \cos x$ for exactly two real values of x.

6. Show that the mean-value formula can be expressed in the form

$$f(x + h) = f(x) + h f'(x + \theta h) \quad \text{where } 0 < \theta < 1.$$

Determine $\theta$ in terms of x and h when (a) $f(x) = x^2$; (b) $f(x) = x^3$. Keep x fixed, $x \ne 0$, and
find the limit of $\theta$ in each case as $h \to 0$.

7. Let f be a polynomial. A real number a is said to be a zero of f of multiplicity m if
$f(x) = (x - a)^m g(x)$, where $g(a) \ne 0$.

(a) If f has r zeros in an interval [a, b], prove that $f'$ has at least r - 1 zeros, and in general,
the kth derivative $f^{(k)}$ has at least r - k zeros in [a, b]. (The zeros are to be counted as often
as their multiplicity indicates.)

(b) If the kth derivative $f^{(k)}$ has exactly r zeros in [a, b], what can you conclude about the
number of zeros of f in [a, b]?

8. Use the mean-value theorem to deduce the following inequalities:

(a) $|\sin x - \sin y| \le |x - y|$.

(b) $n y^{n-1}(x - y) \le x^n - y^n \le n x^{n-1}(x - y)$ if $0 < y \le x$, n = 1, 2, 3, ... .

9. A function f, continuous on [a, b], has a second derivative $f''$ everywhere on the open interval
(a, b). The line segment joining (a, f(a)) and (b, f(b)) intersects the graph of f at a third point
(c, f(c)), where a < c < b. Prove that $f''(t) = 0$ for at least one point t in (a, b).

10. This exercise outlines a proof of the intermediate-value theorem for derivatives. Assume f
has a derivative everywhere on an open interval I. Choose a < b in I. Then $f'$ takes on every value
between $f'(a)$ and $f'(b)$ somewhere in (a, b).

(a) Define a new function g on [a, b] as follows:

$$g(x) = \frac{f(x) - f(a)}{x - a} \quad \text{if } x \ne a, \qquad g(a) = f'(a).$$

Prove that g takes on every value between $f'(a)$ and g(b) in the open interval (a, b). Use the
mean-value theorem for derivatives to show that $f'$ takes on every value between $f'(a)$ and g(b)
in the open interval (a, b).

(b) Define a new function h on [a, b] as follows:

$$h(x) = \frac{f(x) - f(b)}{x - b} \quad \text{if } x \ne b, \qquad h(b) = f'(b).$$

By an argument similar to that in part (a), show that $f'$ takes on every value between $f'(b)$
and h(a) in (a, b). Since h(a) = g(b), this proves the intermediate-value theorem for derivatives.


4.16 Applications of the mean-value theorem to geometric properties of functions 

The mean-value theorem may be used to deduce properties of a function from a 
knowledge of the algebraic sign of its derivative. This is illustrated by the following 
theorem. 


THEOREM 4.7. Let f be a function which is continuous on a closed interval [a, b] and assume
f has a derivative at each point of the open interval (a, b). Then we have:

(a) If $f'(x) > 0$ for every x in (a, b), f is strictly increasing on [a, b];

(b) If $f'(x) < 0$ for every x in (a, b), f is strictly decreasing on [a, b];

(c) If $f'(x) = 0$ for every x in (a, b), f is constant throughout [a, b].

Proof. To prove (a) we must show that f(x) < f(y) whenever $a \le x < y \le b$. Therefore,
suppose x < y and apply the mean-value theorem to the closed subinterval [x, y].
We obtain

$$f(y) - f(x) = f'(c)(y - x), \quad \text{where } x < c < y. \tag{4.26}$$

Since both $f'(c)$ and y - x are positive, so is f(y) - f(x), and this means f(x) < f(y), as
asserted. This proves (a), and the proof of (b) is similar. To prove (c), we use Equation
(4.26) with x = a. Since $f'(c) = 0$, we have f(y) = f(a) for every y in [a, b], so f is constant
on [a, b].

We can use Theorem 4.7 to prove that an extremum occurs whenever the derivative 
changes sign. 

THEOREM 4.8. Assume f is continuous on a closed interval [a, b] and assume that the
derivative $f'$ exists everywhere in the open interval (a, b), except possibly at a point c.

(a) If $f'(x)$ is positive for all x < c and negative for all x > c, then f has a relative
maximum at c.

(b) If, on the other hand, $f'(x)$ is negative for all x < c and positive for all x > c, then f
has a relative minimum at c.

Proof. In case (a), Theorem 4.7(a) tells us that f is strictly increasing on [a, c] and
strictly decreasing on [c, b]. Hence f(x) < f(c) for all $x \ne c$ in (a, b), so f has a relative
maximum at c. This proves (a) and the proof of (b) is entirely analogous. The two cases
are illustrated in Figure 4.12.

Figure 4.12 An extremum occurs when the derivative changes sign. (a) Relative maximum at c. (b) Relative minimum at c.

4.17 Second-derivative test for extrema 

If a function f is continuous on a closed interval [a, b], the extreme-value theorem tells
us that it has an absolute maximum and an absolute minimum somewhere in [a, b]. If f
has a derivative at each interior point, then the only places where extrema can occur are:

(1) at the endpoints a and b;

(2) at those interior points x where $f'(x) = 0$.

Points of type (2) are often called critical points of f. To decide whether there is a maximum
or a minimum (or neither) at a critical point c, we need more information about f. Usually
the behavior of f at a critical point can be determined from the algebraic sign of the
derivative near c. The next theorem shows that a study of the sign of the second derivative
near c can also be helpful.

THEOREM 4.9. SECOND-DERIVATIVE TEST FOR AN EXTREMUM AT A CRITICAL POINT. Let
c be a critical point of f in an open interval (a, b); that is, assume a < c < b and $f'(c) = 0$.
Assume also that the second derivative $f''$ exists in (a, b). Then we have the following:

(a) If $f''$ is negative in (a, b), f has a relative maximum at c.

(b) If $f''$ is positive in (a, b), f has a relative minimum at c.

The two cases are illustrated in Figure 4.12.

Proof. Consider case (a), $f'' < 0$ in (a, b). By Theorem 4.7 (applied to $f'$), the function
$f'$ is strictly decreasing in (a, b). But $f'(c) = 0$, so $f'$ changes its sign from positive to
negative at c, as shown in Figure 4.12(a). Hence, by Theorem 4.8, f has a relative maximum
at c. The proof in case (b) is entirely analogous.

If $f''$ is continuous at c, and if $f''(c) \ne 0$, there will be a neighborhood of c in which $f''$
has the same sign as $f''(c)$. Therefore, if $f'(c) = 0$, the function f has a relative maximum
at c if $f''(c)$ is negative, and a relative minimum if $f''(c)$ is positive. This test suffices for
many examples that occur in practice.
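
The pointwise form of the test is easy to mechanize. The sketch below is illustrative only: the function name `classify_critical_point` and the sample polynomial f(x) = x^3 - 3x are assumptions introduced here, and the caller is expected to supply a point c where f'(c) = 0.

```python
def classify_critical_point(d2f, c):
    """Apply the second-derivative test at a point c where f'(c) = 0."""
    second = d2f(c)
    if second < 0:
        return "relative maximum"
    if second > 0:
        return "relative minimum"
    return "inconclusive"  # the test gives no information when f''(c) = 0


# Illustrative function f(x) = x**3 - 3*x: f'(x) = 3*x**2 - 3 vanishes at
# x = -1 and x = 1, and f''(x) = 6*x.
def d2f(x):
    return 6 * x


for c in (-1.0, 1.0):
    print(c, classify_critical_point(d2f, c))
```

When $f''(c) = 0$ the sketch reports "inconclusive", and one falls back on the sign of $f'$ near c as in Theorem 4.8.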

The sign of the second derivative also governs the convexity or the concavity of f. The
next theorem shows that the function is convex in intervals where $f''$ is positive, as illustrated
by Figure 4.12(b). In Figure 4.12(a), f is concave because $f''$ is negative. It suffices to
discuss only the convex case, because if f is convex, then $-f$ is concave.

THEOREM 4.10. DERIVATIVE TEST FOR CONVEXITY. Assume f is continuous on [a, b] and
has a derivative in the open interval (a, b). If $f'$ is increasing on (a, b), then f is convex on
[a, b]. In particular, f is convex if $f''$ exists and is nonnegative in (a, b).

Proof. Take x < y in [a, b] and let $z = \alpha y + (1 - \alpha)x$, where $0 < \alpha < 1$. We wish
to prove that $f(z) \le \alpha f(y) + (1 - \alpha)f(x)$. Since $f(z) = \alpha f(z) + (1 - \alpha)f(z)$, this is the
same as proving that

$$(1 - \alpha)[f(z) - f(x)] \le \alpha[f(y) - f(z)].$$

By the mean-value theorem (applied twice), there exist points c and d satisfying x < c < z
and z < d < y such that

$$f(z) - f(x) = f'(c)(z - x), \quad \text{and} \quad f(y) - f(z) = f'(d)(y - z).$$

Since $f'$ is increasing, we have $f'(c) \le f'(d)$. Also, we have $(1 - \alpha)(z - x) = \alpha(y - z)$, so
we may write

$$(1 - \alpha)[f(z) - f(x)] = (1 - \alpha)f'(c)(z - x) \le \alpha f'(d)(y - z) = \alpha[f(y) - f(z)],$$

which proves the required inequality for convexity.
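
The convexity inequality $f(z) \le \alpha f(y) + (1 - \alpha)f(x)$ can be spot-checked on a grid. This is a rough sketch under stated assumptions: the helper `is_convex_on_samples`, the grid sizes, and the tolerance are all choices made here for illustration, and passing the check on finitely many samples is evidence, not a proof.

```python
def is_convex_on_samples(f, a, b, n=50, tol=1e-12):
    """Test f(z) <= alpha*f(y) + (1 - alpha)*f(x) on a finite grid.

    x < y range over n + 1 equally spaced points of [a, b] and alpha over
    0.1, 0.2, ..., 0.9. The tolerance absorbs floating-point rounding.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    alphas = [j / 10 for j in range(1, 10)]
    for i, x in enumerate(xs):
        for y in xs[i + 1:]:
            for alpha in alphas:
                z = alpha * y + (1 - alpha) * x
                if f(z) > alpha * f(y) + (1 - alpha) * f(x) + tol:
                    return False
    return True


# f(x) = x**2 has the increasing derivative 2x, so Theorem 4.10 applies.
print(is_convex_on_samples(lambda x: x**2, -2.0, 2.0))
# Its negative is concave and violates the inequality.
print(is_convex_on_samples(lambda x: -(x**2), -2.0, 2.0))
```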


4.18 Curve sketching 

The information gathered in the theorems of the last few sections is often useful in curve
sketching. In drawing the graph of a function f we should first determine the domain of f
[the set of x for which f(x) is defined] and, if it is easy to do so, we should find the range
of f (the set of values taken on by f). A knowledge of the domain and range gives us an
idea of the extent of the curve y = f(x), since it specifies a portion of the xy-plane in which
the entire curve must lie. Then it is a good idea to try to locate those points (if any) where
the curve crosses the coordinate axes. These are called intercepts of the graph. The
y-intercept is simply the point (0, f(0)), assuming 0 is in the domain of f, and the x-intercepts
are those points (x, 0) for which f(x) = 0. Computing the x-intercepts may be extremely
difficult in practice, and we may have to be content with approximate values only.

We should also try to determine intervals in which f is monotonic by examining the sign
of $f'$, and to determine intervals of convexity and concavity by studying the sign of $f''$.
Special attention should be paid to those points where the graph has horizontal tangents.

EXAMPLE 1. The graph of y = f(x), where $f(x) = x + 1/x$ for $x \ne 0$.

In this case, there are no intercepts on either axis. The first two derivatives are given by
the formulas

$$f'(x) = 1 - 1/x^2, \qquad f''(x) = 2/x^3.$$

The first derivative is positive if $x^2 > 1$, negative if $x^2 < 1$, and zero if $x^2 = 1$. Hence
there is a relative minimum at x = 1 and a relative maximum at x = -1. For x > 0,
the second derivative is positive so the first derivative is strictly increasing. For x < 0, the
second derivative is negative, and therefore the first derivative is strictly decreasing. For
x near 0, the term x is small compared to 1/x, and the curve behaves like the curve y = 1/x.
(See Figure 4.13.) On the other hand, for very large x (positive or negative), the term 1/x
is small compared to x, and the curve behaves very much like the line y = x. In this
example, the function is odd, f(-x) = -f(x), so the graph is symmetric with respect to
the origin.
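
A small table of values makes the sign analysis of this example concrete. The sketch below simply evaluates the formulas $f(x) = x + 1/x$, $f'(x) = 1 - 1/x^2$, and $f''(x) = 2/x^3$ from the example; the sample points and the list name `gap` are arbitrary choices made here.

```python
def f(x):
    return x + 1 / x


def df(x):
    return 1 - 1 / x**2


def d2f(x):
    return 2 / x**3


# Signs of f' and f'' on either side of the critical points x = -1 and x = 1.
for x in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
    print(f"x={x:5.1f}  f={f(x):6.2f}  f'={df(x):6.2f}  f''={d2f(x):7.2f}")

# For large x the curve approaches the line y = x: the gap f(x) - x is 1/x.
gap = [f(x) - x for x in (10.0, 100.0, 1000.0)]
print(gap)
```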

In the foregoing example, the line y = x is an asymptote of the curve. In general, a
nonvertical line with equation y = mx + b is called an asymptote of the graph of y = f(x)
if the difference f(x) - (mx + b) tends to 0 as x takes arbitrarily large positive values or
arbitrarily large negative values. A vertical line, x = a, is called a vertical asymptote if
|f(x)| takes arbitrarily large values as $x \to a$ from the right or from the left. In the foregoing
example, the y-axis is a vertical asymptote.


EXAMPLE 2. The graph of y = f(x), where $f(x) = 1/(x^2 + 1)$.

This is an even function, positive for all x, and has the x-axis as a horizontal asymptote.
The first derivative is given by

$$f'(x) = \frac{-2x}{(x^2 + 1)^2},$$

so $f'(x) < 0$ if x > 0, $f'(x) > 0$ if x < 0, and $f'(x) = 0$ when x = 0. Therefore the
function increases over the negative axis, decreases over the positive axis, and has a relative
maximum at x = 0. Differentiating once more, we find that

$$f''(x) = \frac{(x^2 + 1)^2(-2) - (-2x) \cdot 2(x^2 + 1)(2x)}{(x^2 + 1)^4} = \frac{2(3x^2 - 1)}{(x^2 + 1)^3}.$$

Thus $f''(x) > 0$ if $3x^2 > 1$, and $f''(x) < 0$ if $3x^2 < 1$. Hence, the first derivative increases
when $x^2 > \frac{1}{3}$ and decreases when $x^2 < \frac{1}{3}$. This information suffices to draw the curve in
Figure 4.14. The two points on the graph corresponding to $x^2 = \frac{1}{3}$, where the second
derivative changes its sign, are called points of inflection.
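
The closed form for $f''$ can be cross-checked against a finite-difference approximation. This is only a sketch: the helper `second_difference`, its step size, and the test points are assumptions chosen here, and the central second difference is accurate only to a few digits.

```python
import math


def f(x):
    return 1 / (x**2 + 1)


def d2f(x):
    # Closed form from the example: 2(3x^2 - 1) / (x^2 + 1)^3.
    return 2 * (3 * x**2 - 1) / (x**2 + 1) ** 3


def second_difference(func, x, h=1e-4):
    """Central second difference, an approximation to func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2


x_infl = 1 / math.sqrt(3)  # positive solution of 3x^2 = 1

print(d2f(0.5), second_difference(f, 0.5))     # the two values should be close
print(d2f(x_infl - 0.1), d2f(x_infl + 0.1))    # negative, then positive
```

The sign change of $f''$ across x = 1/sqrt(3) is what marks the point of inflection; by evenness the same happens at x = -1/sqrt(3).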


4.19 Exercises 


In the following exercises, (a) find all points x such that $f'(x) = 0$; (b) examine the sign of $f'$
and determine those intervals in which f is monotonic; (c) examine the sign of $f''$ and determine
those intervals in which $f'$ is monotonic; (d) make a sketch of the graph of f. In each case, the
function is defined for all x for which the given formula for f(x) is meaningful.


1. $f(x) = x^2 - 3x + 2$.

2. $f(x) = x^3 - 4x$.

3. $f(x) = (x - 1)^2(x + 2)$.

4. $f(x) = x^3 - 6x^2 + 9x + 5$.

5. $f(x) = 2 + (x - 1)^4$.

6. $f(x) = 1/x^2$.

7. $f(x) = x + 1/x^2$.


9. $f(x) = x/(1 + x^2)$.

10. $f(x) = (x^2 - 4)/(x^2 - 9)$.

11. $f(x) = \sin^2 x$.

12. $f(x) = x - \sin x$.

13. $f(x) = x + \cos x$.

14. $f(x) = \frac{1}{6}x^2 + \frac{1}{12}\cos 2x$.


4.20 Worked examples of extremum problems 

Many extremum problems in both pure and applied mathematics can be attacked 
systematically with the use of differential calculus. As a matter of fact, the rudiments of 
differential calculus were first developed when Fermat tried to find general methods for 
determining maxima and minima. We shall solve a few examples in this section and give 
the reader an opportunity to solve others in the next set of exercises. 

First we formulate two simple principles which can be used to solve many extremum
problems.

EXAMPLE 1. Constant-sum, maximum-product principle. Given a positive number S.
Prove that among all choices of positive numbers x and y with x + y = S, the product xy
is largest when $x = y = \frac{1}{2}S$.

Proof. If x + y = S, then y = S - x and the product xy is equal to $x(S - x) = xS - x^2$.
Let $f(x) = xS - x^2$. This quadratic polynomial has first derivative $f'(x) = S - 2x$
which is positive for $x < \frac{1}{2}S$ and negative for $x > \frac{1}{2}S$. Hence the maximum of
xy occurs when $x = \frac{1}{2}S$, $y = S - x = \frac{1}{2}S$. This can also be proved without the use of
calculus. We simply write $f(x) = \frac{1}{4}S^2 - (x - \frac{1}{2}S)^2$ and note that f(x) is largest when
$x = \frac{1}{2}S$.
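
A brute-force search gives a quick sanity check of the constant-sum principle. The sketch is illustrative only: the function `best_split`, the value S = 10, and the grid resolution are assumptions made here, and a grid search can only confirm the result to the resolution of the grid.

```python
def best_split(S, steps=1000):
    """Return the grid value x in (0, S) that maximizes the product x * (S - x)."""
    candidates = [S * i / steps for i in range(1, steps)]
    return max(candidates, key=lambda x: x * (S - x))


S = 10.0                       # illustrative sum
x = best_split(S)
print(x, S - x, x * (S - x))   # the two parts and their product
```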


EXAMPLE 2. Constant-product, minimum-sum principle. Given a positive number P.
Prove that among all choices of positive numbers x and y with xy = P, the sum x + y is
smallest when $x = y = \sqrt{P}$.

Proof. We must determine the minimum of the function $f(x) = x + P/x$ for x > 0.
The first derivative is $f'(x) = 1 - P/x^2$. This is negative for $x^2 < P$ and positive for
$x^2 > P$, so f(x) has its minimum at $x = \sqrt{P}$. Hence, the sum x + y is smallest when
$x = y = \sqrt{P}$.

EXAMPLE 3. Among all rectangles of given perimeter, the square has the largest area.

Proof. We use the result of Example 1. Let x and y denote the sides of a general
rectangle. If the perimeter is fixed, then x + y is constant, so the area xy has its largest
value when x = y. Hence, the maximizing rectangle is a square.

EXAMPLE 4. The geometric mean of two positive numbers does not exceed their arithmetic
mean. That is, $\sqrt{ab} \le \frac{1}{2}(a + b)$.

Proof. Given a > 0, b > 0, let P = ab. Among all positive x and y with xy = P, the
sum x + y is smallest when $x = y = \sqrt{P}$. In other words, if xy = P, then
$x + y \ge \sqrt{P} + \sqrt{P} = 2\sqrt{P}$. In particular, $a + b \ge 2\sqrt{P} = 2\sqrt{ab}$, so $\sqrt{ab} \le \frac{1}{2}(a + b)$. Equality
occurs if and only if a = b.

EXAMPLE 5. A block of weight W is to be moved along a flat table by a force inclined
at an angle $\theta$ with the line of motion, where $0 \le \theta \le \frac{1}{2}\pi$, as shown in Figure 4.15. Assume
the motion is resisted by a frictional force which is proportional to the normal force with
which the block presses perpendicularly against the surface of the table. Find the angle $\theta$
for which the propelling force needed to overcome friction will be as small as possible.

Solution. Let $F(\theta)$ denote the propelling force. It has an upward vertical component
$F(\theta) \sin \theta$, so the net normal force pressing against the table is $N = W - F(\theta) \sin \theta$. The
frictional force is $\mu N$, where $\mu$ (the Greek letter mu) is a constant called the coefficient of
friction. The horizontal component of the propelling force is $F(\theta) \cos \theta$. When this is
equated to the frictional force, we get $F(\theta) \cos \theta = \mu[W - F(\theta) \sin \theta]$ from which we find

$$F(\theta) = \frac{\mu W}{\cos \theta + \mu \sin \theta}.$$

To minimize $F(\theta)$, we maximize the denominator $g(\theta) = \cos \theta + \mu \sin \theta$ in the interval
$0 \le \theta \le \frac{1}{2}\pi$. At the endpoints, we have g(0) = 1 and $g(\frac{1}{2}\pi) = \mu$. In the interior of the
interval, we have

$$g'(\theta) = -\sin \theta + \mu \cos \theta,$$

so g has a critical point at $\theta = \alpha$, where $\sin \alpha = \mu \cos \alpha$. This gives
$g(\alpha) = \cos \alpha + \mu^2 \cos \alpha = (1 + \mu^2) \cos \alpha$. We can express $\cos \alpha$ in terms of $\mu$. Since
$\mu^2 \cos^2 \alpha = \sin^2 \alpha = 1 - \cos^2 \alpha$, we find $(1 + \mu^2) \cos^2 \alpha = 1$, so $\cos \alpha = 1/\sqrt{1 + \mu^2}$. Thus $g(\alpha) = \sqrt{1 + \mu^2}$.

Since $g(\alpha)$ exceeds g(0) and $g(\frac{1}{2}\pi)$, the maximum of g occurs at the critical point. Hence the
minimum force required is

$$F(\alpha) = \frac{\mu W}{g(\alpha)} = \frac{\mu W}{\sqrt{1 + \mu^2}}.$$
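
The closed-form minimum can be compared with a direct scan over the angle. In this sketch the values of the friction coefficient and the weight are made-up illustrative numbers, the names `propelling_force` and `optimal_angle` are introduced here, and the critical angle is computed from $\sin \alpha = \mu \cos \alpha$, that is, $\alpha = \arctan \mu$.

```python
import math


def propelling_force(theta, mu, W):
    """F(theta) = mu*W / (cos(theta) + mu*sin(theta)), as derived in Example 5."""
    return mu * W / (math.cos(theta) + mu * math.sin(theta))


def optimal_angle(mu):
    """Critical angle alpha with sin(alpha) = mu*cos(alpha), i.e. alpha = atan(mu)."""
    return math.atan(mu)


mu, W = 0.5, 100.0  # illustrative coefficient of friction and weight
alpha = optimal_angle(mu)
f_min = propelling_force(alpha, mu, W)
closed_form = mu * W / math.sqrt(1 + mu**2)

# Scan 0 <= theta <= pi/2 to confirm that no sampled angle does better.
grid = [propelling_force(k * (math.pi / 2) / 1000, mu, W) for k in range(1001)]
print(alpha, f_min, closed_form, min(grid))
```

Pulling slightly upward reduces the normal force, which is why the minimizing angle is strictly between 0 and $\frac{1}{2}\pi$ whenever $\mu > 0$.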


EXAMPLE 6. Find the shortest distance from a given point (0, b) on the y-axis to the
parabola $x^2 = 4y$. (The number b may have any real value.)

Solution. The parabola is shown in Figure 4.16. The quantity to be minimized is the
distance d, where

$$d = \sqrt{x^2 + (y - b)^2},$$

subject to the restriction $x^2 = 4y$. It is clear from the figure that when b is negative the
minimum distance is |b|. As the point (0, b) moves upward along the positive y-axis,
the minimum is b until the point reaches a certain special position, above which the
minimum is less than b. The exact location of this special position will now be determined.

First of all, we observe that the point (x, y) that minimizes d also minimizes $d^2$. (This
observation enables us to avoid differentiation of square roots.) At this stage, we may
express $d^2$ in terms of x alone or else in terms of y alone. We shall express $d^2$ in terms of
y and leave it as an exercise for the reader to carry out the calculations when $d^2$ is expressed
in terms of x.

Therefore the function f to be minimized is given by the formula

$$f(y) = d^2 = 4y + (y - b)^2.$$

Although f(y) is defined for all real y, the nature of the problem requires that we seek the
minimum only among those $y \ge 0$. The derivative, given by $f'(y) = 4 + 2(y - b)$, is zero
only when y = b - 2. When b < 2, this leads to a negative critical point y which is
excluded by the restriction $y \ge 0$. In other words, if b < 2, the minimum does not occur
at a critical point. In fact, when b < 2, we see that $f'(y) > 0$ when $y \ge 0$, and hence
f is strictly increasing for $y \ge 0$. Therefore the absolute minimum occurs at the endpoint
y = 0. The corresponding minimum d is $\sqrt{b^2} = |b|$.

If $b \ge 2$, there is a legitimate critical point at y = b - 2. Since $f''(y) = 2$ for all y,
the derivative $f'$ is increasing, and hence the absolute minimum of f occurs at this critical
point. The minimum d is $\sqrt{4(b - 2) + 4} = 2\sqrt{b - 1}$. Thus we have shown that the
minimum distance is |b| if b < 2 and is $2\sqrt{b - 1}$ if $b \ge 2$. (The value b = 2 is the special
value referred to above.)
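
The two-case answer can be compared with a brute-force minimization of $d^2 = 4y + (y - b)^2$ over $y \ge 0$. This is a sketch under assumptions made here: the function names, the cutoff `y_max`, and the grid size are arbitrary, and the scan only approximates the true minimum.

```python
import math


def min_distance(b):
    """Closed-form result of Example 6: |b| if b < 2, else 2*sqrt(b - 1)."""
    if b < 2:
        return abs(b)
    return 2 * math.sqrt(b - 1)


def brute_force_distance(b, y_max=50.0, steps=20000):
    """Minimize d^2 = 4y + (y - b)^2 over a grid of y in [0, y_max]."""
    best = float("inf")
    for i in range(steps + 1):
        y = y_max * i / steps
        best = min(best, 4 * y + (y - b) ** 2)
    return math.sqrt(best)


for b in (-3.0, 1.0, 2.0, 5.0):
    print(b, min_distance(b), brute_force_distance(b))
```

Both branches give the value 2 at b = 2, so the minimum distance depends continuously on b.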


4.21 Exercises 

1. Prove that among all rectangles of a given area, the square has the smallest perimeter.

2. A farmer has L feet of fencing to enclose a rectangular pasture adjacent to a long stone wall.
What dimensions give the maximum area of the pasture?

3. A farmer wishes to enclose a rectangular pasture of area A adjacent to a long stone wall. What
dimensions require the least amount of fencing?

4. Given S > 0. Prove that among all positive numbers x and y with x + y = S, the sum
$x^2 + y^2$ is smallest when x = y.

5. Given R > 0. Prove that among all positive numbers x and y with $x^2 + y^2 = R$, the sum
x + y is largest when x = y.

6. Each edge of a square has length L. Prove that among all squares inscribed in the given
square, the one of minimum area has edges of length $\frac{1}{2}L\sqrt{2}$.

7. Each edge of a square has length L. Find the size of the square of largest area that can be
circumscribed about the given square.

8. Prove that among all rectangles that can be inscribed in a given circle, the square has the 
largest area. 

9. Prove that among all rectangles of a given area, the square has the smallest circumscribed 
circle. 

10. Given a sphere of radius R. Find the radius r and altitude h of the right circular cylinder with
largest lateral surface area $2\pi rh$ that can be inscribed in the sphere.

11. Among all right circular cylinders of given lateral surface area, prove that the smallest
circumscribed sphere has radius $\sqrt{2}$ times that of the cylinder.

12. Given a right circular cone with radius R and altitude H. Find the radius and altitude of the
right circular cylinder of largest lateral surface area that can be inscribed in the cone.

13. Find the dimensions of the right circular cylinder of maximum volume that can be inscribed in 
a right circular cone of radius R and altitude H. 

14. Given a sphere of radius R. Compute, in terms of R, the radius r and the altitude h of the 
right circular cone of maximum volume that can be inscribed in this sphere. 

15. Find the rectangle of largest area that can be inscribed in a semicircle, the lower base being on 
the diameter. 

16. Find the trapezoid of largest area that can be inscribed in a semicircle, the lower base being on 
the diameter. 

17. An open box is made from a rectangular piece of material by removing equal squares at each 
corner and turning up the sides. Find the dimensions of the box of largest volume that can 
be made in this manner if the material has sides (a) 10 and 10; (b) 12 and 18. 

18. If a and b are the legs of a right triangle whose hypotenuse is 1, find the largest value of 2a + b.

19. A truck is to be driven 300 miles on a freeway at a constant speed of x miles per hour. Speed
laws require $30 \le x \le 60$. Assume that fuel costs 30 cents per gallon and is consumed at the
rate of $2 + x^2/600$ gallons per hour. If the driver's wages are D dollars per hour and if he
obeys all speed laws, find the most economical speed and the cost of the trip if (a) D = 0,
(b) D = 1, (c) D = 2, (d) D = 3, (e) D = 4.

20. A cylinder is obtained by revolving a rectangle about the x-axis, the base of the rectangle
lying on the x-axis and the entire rectangle lying in the region between the curve $y = x/(x^2 + 1)$
and the x-axis. Find the maximum possible volume of the cylinder.

21. The lower right-hand corner of a page is folded over so as to reach the leftmost edge. (See 
Figure 4.17.) If the width of the page is six inches, find the minimum length of the crease. 
What angle will this minimal crease make with the rightmost edge of the page? Assume the 
page is long enough to prevent the crease reaching the top of the page. 



Figure 4.17 Exercise 21 


22. (a) An isosceles triangle is inscribed in a circle of radius r as shown in Figure 4.18. If the
angle $2\alpha$ at the apex is restricted to lie between 0 and $\frac{1}{2}\pi$, find the largest value and the smallest
value of the perimeter of the triangle. Give full details of your reasoning.

(b) What is the radius of the smallest circular disk large enough to cover every isosceles
triangle of a given perimeter L? Give full details of your reasoning.

23. A window is to be made in the form of a rectangle surmounted by a semicircle with diameter 
equal to the base of the rectangle. The rectangular portion is to be of clear glass, and the 
semicircular portion is to be of a colored glass admitting only half as much light per square 
foot as the clear glass. The total perimeter of the window frame is to be a fixed length P. Find, 
in terms of P, the dimensions of the window which will admit the most light. 

24. A log 12 feet long has the shape of a frustum of a right circular cone with diameters 4 feet and 
(4 + h) feet at its ends, where h > 0. Determine, as a function of h, the volume of the largest 
right circular cylinder that can be cut from the log, if its axis coincides with that of the log. 

25. Given n real numbers $a_1, \ldots, a_n$. Prove that the sum $\sum_{k=1}^{n} (x - a_k)^2$ is smallest when x is
the arithmetic mean of $a_1, \ldots, a_n$.

26. If x > 0, let $f(x) = 5x^2 + Ax^{-5}$, where A is a positive constant. Find the smallest A such that
$f(x) \ge 24$ for all x > 0.

27. For each real t, let $f(x) = -\frac{1}{3}x^3 + t^2 x$, and let m(t) denote the minimum of f(x) over the
interval $0 \le x \le 1$. Determine the value of m(t) for each t in the interval $-1 \le t \le 1$.
Remember that for some values of t the minimum of f(x) may occur at the endpoints of the
interval $0 \le x \le 1$.

28. A number x is known to lie in an interval $a \le x \le b$, where a > 0. We wish to approximate
x by another number t in [a, b] so that the relative error, $|t - x|/x$, will be as small as possible.
Let M(t) denote the maximum value of $|t - x|/x$ as x varies from a to b. (a) Prove that this
maximum occurs at one of the endpoints x = a or x = b. (b) Prove that M(t) is smallest when
t is the harmonic mean of a and b, that is, when $1/t = \frac{1}{2}(1/a + 1/b)$.


*4.22 Partial derivatives 

This section explains the concept of partial derivative and introduces the reader to some 
notation and terminology. We shall not make use of the results of this section anywhere 
else in Volume 1, so this material may be omitted or postponed without loss in continuity. 

In Chapter 1, a function was defined to be a correspondence which associates with each 
object in a set X one and only one object in another set Y; the set X is referred to as the 
domain of the function. Up to now, we have dealt with functions having a domain consisting 
of points on the x-axis. Such functions are usually called functions of one real variable. It 
is not difficult to extend many of the ideas of calculus to functions of two or more real 
variables. 

By a real-valued function of two real variables we mean one whose domain X is a set of
points in the xy-plane. If f denotes such a function, its value at a point (x, y) is a real
number, written f(x, y). It is easy to imagine how such a function might arise in a physical
problem. For example, suppose a flat metal plate in the shape of a circular disk of radius
4 centimeters is placed on the xy-plane, with the center of the disk at the origin and with
the disk heated in such a way that its temperature at each point (x, y) is $16 - x^2 - y^2$
degrees centigrade. If we denote the temperature at (x, y) by f(x, y), then f is a function
of two variables defined by the equation

$$f(x, y) = 16 - x^2 - y^2. \tag{4.27}$$

The domain of this function is the set of all points (x, y) whose distance from the origin
does not exceed 4. The theorem of Pythagoras tells us that all points (x, y) at a distance
r from the origin satisfy the equation

$$x^2 + y^2 = r^2. \tag{4.28}$$

Therefore the domain in this case consists of all points (x, y) which satisfy the inequality
$x^2 + y^2 \le 16$. Note that on the circle described by (4.28), the temperature is
$f(x, y) = 16 - r^2$. That is, the function f is constant on each circle with center at the origin. (See
Figure 4.19.)

We shall describe two useful methods for obtaining a geometric picture of a function of
two variables. One is by means of a surface in space. To construct this surface, we introduce
a third coordinate axis (called the z-axis); it passes through the origin and is perpendicular
to the xy-plane. Above each point (x, y) we plot the point (x, y, z) whose z-coordinate is
obtained from the equation z = f(x, y).

The surface for the example described above is shown in Figure 4.20. If we placed a
thermometer at a point (x, y) on the plate, the top of the mercury column would just touch
the surface at the point (x, y, z) where z = f(x, y) provided, of course, that unit distances
on the z-axis are properly chosen.

A different kind of picture of a function of two variables can be drawn entirely in the
xy-plane. This is the method of contour lines that is used by map makers to represent a
three-dimensional landscape by a two-dimensional drawing. We imagine that the surface
described above has been cut by various horizontal planes (parallel to the xy-plane). They
intersect the surface at those points (x, y, z) whose elevation z is constant. By projecting
these points on the xy-plane, we get a family of contour lines or level curves. Each level
curve consists of those and only those points (x, y) whose coordinates satisfy the equation
f(x, y) = c, where c is the constant elevation for that particular curve. In the example
mentioned above, the level curves are concentric circles, and they represent curves of
constant temperature, or isothermals, as might be drawn on a weather map. Another
example of a surface and its level curves is shown in Figure 4.21. The equation in this case
is z = xy. The "saddle-shaped" surface is known as a hyperbolic paraboloid.

Figure 4.21 (a) A surface whose equation is z = xy. (b) The corresponding level curves xy = constant.

Contour lines on topographic maps are often shown for every 100 ft of elevation. When
they are close together, the elevation is changing rapidly as we move from one contour to
the next; this happens in the vicinity of a steep mountain. When the contour lines are far
apart the elevation is changing slowly. We can get a general idea of the steepness of a
landscape by considering the spacing of its level curves. However, to get precise information
concerning the rate of change of the elevation, we must describe the surface in terms of a
function to which we can apply the ideas of differential calculus.

Figure 4.22 The curve of intersection of a surface z = f(x, y) and a plane $y = y_0$.

The rate at which the elevation is changing at a point $(x_0, y_0)$ depends on the direction
in which we move away from this point. For the sake of simplicity, we shall consider at
this time just the two special directions, parallel to the x- and y-axes. Suppose we examine
a surface described by an equation of the form z = f(x, y); let us cut this surface with a
plane perpendicular to the y-axis, as shown in Figure 4.22. Such a plane consists of all
points (x, y, z) in space for which the y-coordinate is constant, say $y = y_0$. (The equation
$y = y_0$ is called an equation of this plane.) The intersection of this plane with the surface
is a plane curve, all points of which satisfy the equation $z = f(x, y_0)$. On this curve the
elevation $f(x, y_0)$ is a function of x alone.

Suppose now we move from a point $(x_0, y_0)$ to a point $(x_0 + h, y_0)$. The corresponding
change in elevation is $f(x_0 + h, y_0) - f(x_0, y_0)$. This suggests that we form the difference
quotient

(4.29)    $$\frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}$$

and let $h \to 0$. If this quotient approaches a definite limit as $h \to 0$, we call this limit the
partial derivative of $f$ with respect to $x$ at $(x_0, y_0)$. There are various symbols that are used
to denote partial derivatives, some of the most common ones being

$$\frac{\partial f(x_0, y_0)}{\partial x}, \quad f'_x(x_0, y_0), \quad f_x(x_0, y_0), \quad f_1(x_0, y_0), \quad D_1 f(x_0, y_0).$$

The subscript 1 in the last two notations refers to the fact that only the first coordinate is
allowed to change when we form the difference quotient in (4.29). Thus we have

$$f_1(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + h, y_0) - f(x_0, y_0)}{h}.$$


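Similarly, we define the partial derivative with respect to $y$ at $(x_0, y_0)$ by the equation

$$f_2(x_0, y_0) = \lim_{k \to 0} \frac{f(x_0, y_0 + k) - f(x_0, y_0)}{k},$$

alternative notations being

$$\frac{\partial f(x_0, y_0)}{\partial y}, \quad f'_y(x_0, y_0), \quad f_y(x_0, y_0), \quad D_2 f(x_0, y_0).$$

If we write $z = f(x, y)$, then $\partial z/\partial x$ and $\partial z/\partial y$ are also used to denote partial derivatives.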

Partial differentiation is not a new concept. If we introduce another function $g$ of one
variable, defined by the equation

$$g(x) = f(x, y_0),$$

then the ordinary derivative $g'(x_0)$ is exactly the same as the partial derivative $f_1(x_0, y_0)$.
Geometrically, the partial derivative $f_1(x, y_0)$ represents the slope of the tangent line at a
typical point of the curve shown in Figure 4.22. In the same way, when $x$ is constant, say
$x = x_0$, the equation $z = f(x_0, y)$ describes the curve of intersection of the surface with
the plane whose equation is $x = x_0$. The partial derivative $f_2(x_0, y)$ gives the slope of the
line tangent to this curve. From these remarks we see that to compute the partial derivative
of $f(x, y)$ with respect to $x$, we can treat $y$ as though it were constant and use the ordinary
rules of differential calculus. Thus, for example, if $f(x, y) = 16 - x^2 - y^2$, we get
$f_1(x, y) = -2x$. Similarly, if we hold $x$ fixed, we find $f_2(x, y) = -2y$.

Another example is the function given by

(4.30)    $$f(x, y) = x \sin y + y^2 \cos xy.$$

Its partial derivatives are

$$f_1(x, y) = \sin y - y^3 \sin xy, \qquad f_2(x, y) = x \cos y - xy^2 \sin xy + 2y \cos xy.$$


Partial differentiation is a process which produces new functions $f_1 = \partial f/\partial x$ and
$f_2 = \partial f/\partial y$ from a given function $f$. Since $f_1$ and $f_2$ are also functions of two variables, we
can consider their partial derivatives. These are called second-order partial derivatives of
$f$, denoted as follows:

$$f_{1,1} = f_{xx} = \frac{\partial^2 f}{\partial x^2}, \quad f_{1,2} = f_{xy} = \frac{\partial^2 f}{\partial y\,\partial x}, \quad f_{2,1} = f_{yx} = \frac{\partial^2 f}{\partial x\,\partial y}, \quad f_{2,2} = f_{yy} = \frac{\partial^2 f}{\partial y^2}.$$

Notice that $f_{1,2}$ means $(f_1)_2$, the partial derivative of $f_1$ with respect to $y$. In the $\partial$-notation,
we indicate the order of derivatives by writing

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right).$$

This does not always yield the same result as the other mixed partial derivative,

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right).$$

However, equality of the two mixed partial derivatives does hold under certain conditions
that are usually satisfied by most functions that occur in practice. We shall discuss these
conditions further in Volume II.

Referring to the example in (4.27), we find that its second-order partial derivatives are
given by the following formulas:

$$f_{1,1}(x, y) = -2, \quad f_{1,2}(x, y) = f_{2,1}(x, y) = 0, \quad f_{2,2}(x, y) = -2.$$





For the example in (4.30), we obtain

$$f_{1,1}(x, y) = -y^4 \cos xy,$$

$$f_{1,2}(x, y) = \cos y - xy^3 \cos xy - 3y^2 \sin xy,$$

$$f_{2,1}(x, y) = \cos y - xy^3 \cos xy - y^2 \sin xy - 2y^2 \sin xy = f_{1,2}(x, y),$$

$$f_{2,2}(x, y) = -x \sin y - x^2y^2 \cos xy - 2xy \sin xy - 2xy \sin xy + 2 \cos xy
             = -x \sin y - x^2y^2 \cos xy - 4xy \sin xy + 2 \cos xy.$$

A more detailed study of partial derivatives will be undertaken in Volume II.
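
The partial derivatives of the function in (4.30) can be checked numerically by forming difference quotients in one variable while the other is held fixed. The sketch below is only an illustration: the helper names `partial_x` and `partial_y`, the sample point, and the step size are assumptions, and a symmetric quotient is used in place of the one-sided quotient (4.29) because it is numerically more accurate.

```python
import math

def f(x, y):
    # The function in (4.30): f(x, y) = x sin y + y^2 cos(xy)
    return x * math.sin(y) + y**2 * math.cos(x * y)

def f1(x, y):
    # Partial derivative with respect to x, as stated in the text
    return math.sin(y) - y**3 * math.sin(x * y)

def f2(x, y):
    # Partial derivative with respect to y, as stated in the text
    return x * math.cos(y) - x * y**2 * math.sin(x * y) + 2 * y * math.cos(x * y)

def partial_x(func, x, y, h=1e-6):
    # Symmetric difference quotient in x; y is treated as a constant
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, k=1e-6):
    # Symmetric difference quotient in y; x is treated as a constant
    return (func(x, y + k) - func(x, y - k)) / (2 * k)

x0, y0 = 0.7, 1.3                      # arbitrary sample point (assumption)
err_x = abs(partial_x(f, x0, y0) - f1(x0, y0))
err_y = abs(partial_y(f, x0, y0) - f2(x0, y0))

# Mixed second-order partials: differentiate f1 in y and f2 in x.
mixed_12 = partial_y(f1, x0, y0)
mixed_21 = partial_x(f2, x0, y0)
mixed_gap = abs(mixed_12 - mixed_21)

print(err_x, err_y, mixed_gap)
```

All three printed numbers should be very small, which is consistent with the formulas for $f_1$, $f_2$ and with the equality $f_{1,2} = f_{2,1}$ found for this example.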


*4.23 Exercises 


In Exercises 1 through 8, compute all first- and second-order partial derivatives. In each case
verify that the mixed partial derivatives $f_{1,2}(x, y)$ and $f_{2,1}(x, y)$ are equal.

1. $f(x, y) = x^4 + y^4 - 4x^2y^2$.

2. $f(x, y) = x \sin (x + y)$.

3. $f(x, y) = xy + \dfrac{x}{y}$  $(y \neq 0)$.

4. $f(x, y) = \sqrt{x^2 + y^2}$.

5. $f(x, y) = \sin (x^2y^3)$.

6. $f(x, y) = \sin [\cos (2x - 3y)]$.

8. $f(x, y) = \dfrac{x}{\sqrt{x^2 + y^2}}$  $(x, y) \neq (0, 0)$.

9. Show that $x(\partial z/\partial x) + y(\partial z/\partial y) = 2z$ if (a) $z = (x - 2y)^2$, (b) $z = (x^4 + y^4)^{1/2}$.

10. If $f(x, y) = xy/(x^2 + y^2)^2$ for $(x, y) \neq (0, 0)$, show that

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0.$$


THE RELATION BETWEEN INTEGRATION
AND DIFFERENTIATION


5.1 The derivative of an indefinite integral. The first fundamental theorem of calculus 

We come now to the remarkable connection that exists between integration and
differentiation. The relationship between these two processes is somewhat analogous to
that which holds between “squaring” and “taking the square root.” If we square a positive
number and then take the positive square root of the result, we get the original number
back again. Similarly, if we operate on a continuous function $f$ by integration, we get a
new function (an indefinite integral of $f$) which, when differentiated, leads back to the
original function $f$. For example, if $f(x) = x^2$, then an indefinite integral $A$ of $f$ may be
defined by the equation

$$A(x) = \int_c^x f(t)\,dt = \int_c^x t^2\,dt = \frac{x^3}{3} - \frac{c^3}{3},$$

where $c$ is a constant. Differentiating, we find $A'(x) = x^2 = f(x)$. This example illustrates
a general result, called the first fundamental theorem of calculus, which may be stated as
follows:

THEOREM 5.1. FIRST FUNDAMENTAL THEOREM OF CALCULUS. Let $f$ be a function that is
integrable on $[a, x]$ for each $x$ in $[a, b]$. Let $c$ be such that $a \le c \le b$ and define a new
function $A$ as follows:

$$A(x) = \int_c^x f(t)\,dt \quad \text{if } a \le x \le b.$$

Then the derivative $A'(x)$ exists at each point $x$ in the open interval $(a, b)$ where $f$ is continuous,
and for such $x$ we have

(5.1)    $$A'(x) = f(x).$$

First we give a geometric argument which suggests why the theorem ought to be true;
then we give an analytic proof.

Geometric motivation. Figure 5.1 shows the graph of a function $f$ over an interval $[a, b]$.
In the figure, $h$ is positive and

$$\int_x^{x+h} f(t)\,dt = \int_c^{x+h} f(t)\,dt - \int_c^x f(t)\,dt = A(x + h) - A(x).$$

The example shown is continuous throughout the interval $[x, x + h]$. Therefore, by the
mean-value theorem for integrals, we have

$$A(x + h) - A(x) = h f(z), \quad \text{where } x \le z \le x + h.$$

Hence we have

(5.2)    $$\frac{A(x + h) - A(x)}{h} = f(z),$$

and, since $x \le z \le x + h$, we find that $f(z) \to f(x)$ as $h \to 0$ through positive values. A
similar argument is valid if $h \to 0$ through negative values. Therefore, $A'(x)$ exists and is
equal to $f(x)$.

Figure 5.1 Geometric motivation for the first fundamental theorem of calculus.

This argument assumes that the function $f$ is continuous in some neighborhood of the
point $x$. However, the hypothesis of the theorem refers only to continuity of $f$ at a single
point $x$. Therefore, we use a different method to prove the theorem under this weaker
hypothesis.

Analytic Proof. Let $x$ be a point of continuity of $f$, keep $x$ fixed, and form the quotient

$$\frac{A(x + h) - A(x)}{h}.$$

To prove the theorem we must show that this quotient approaches the limit $f(x)$ as $h \to 0$.
The numerator is

$$A(x + h) - A(x) = \int_c^{x+h} f(t)\,dt - \int_c^x f(t)\,dt = \int_x^{x+h} f(t)\,dt.$$

If we write $f(t) = f(x) + [f(t) - f(x)]$ in the last integral, we obtain

$$A(x + h) - A(x) = \int_x^{x+h} f(x)\,dt + \int_x^{x+h} [f(t) - f(x)]\,dt
                 = h f(x) + \int_x^{x+h} [f(t) - f(x)]\,dt,$$

from which we find

(5.3)    $$\frac{A(x + h) - A(x)}{h} = f(x) + \frac{1}{h} \int_x^{x+h} [f(t) - f(x)]\,dt.$$

Therefore, to complete the proof of (5.1), all we need to do is show that

$$\lim_{h \to 0} \frac{1}{h} \int_x^{x+h} [f(t) - f(x)]\,dt = 0.$$

It is this part of the proof that makes use of the continuity of $f$ at $x$.

Let us denote the second term on the right of (5.3) by $G(h)$. We are to prove that
$G(h) \to 0$ as $h \to 0$. Using the definition of limit, we must show that for every $\epsilon > 0$ there
is a $\delta > 0$ such that

(5.4)    $$|G(h)| < \epsilon \quad \text{whenever } 0 < |h| < \delta.$$

Continuity of $f$ at $x$ tells us that, if $\epsilon$ is given, there is a positive $\delta$ such that

(5.5)    $$|f(t) - f(x)| < \tfrac{1}{2}\epsilon$$

whenever

(5.6)    $$x - \delta < t < x + \delta.$$

If we choose $h$ so that $0 < h < \delta$, then every $t$ in the interval $[x, x + h]$ satisfies (5.6) and
hence (5.5) holds for every such $t$. Using the property $\left|\int_x^{x+h} g(t)\,dt\right| \le \int_x^{x+h} |g(t)|\,dt$ with
$g(t) = f(t) - f(x)$, we see that the inequality in (5.5) leads to the relation

$$\left|\int_x^{x+h} [f(t) - f(x)]\,dt\right| \le \int_x^{x+h} |f(t) - f(x)|\,dt \le \int_x^{x+h} \tfrac{1}{2}\epsilon\,dt = \tfrac{1}{2}h\epsilon < h\epsilon.$$

If we divide by $h$, we see that (5.4) holds for $0 < h < \delta$. If $h < 0$, a similar argument
proves that (5.4) holds whenever $0 < |h| < \delta$, and this completes the proof.
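
The statement $A'(x) = f(x)$ can be illustrated numerically for the example $f(t) = t^2$ used at the start of this section. The following is a minimal sketch, not part of the proof: the midpoint-rule helper `midpoint_integral`, the lower limit $c = 1$, and the point $x = 2$ are assumptions chosen only for the demonstration.

```python
def midpoint_integral(f, a, b, n=2000):
    """Composite midpoint approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(t):
    return t * t            # the example f(t) = t^2 from the text

c = 1.0                     # lower limit of integration (assumed for the demo)

def A(x):
    # Indefinite integral A(x) = integral of f from c to x, computed numerically
    return midpoint_integral(f, c, x)

x = 2.0
errors = []
for h in (0.1, 0.01, 0.001):
    quotient = (A(x + h) - A(x)) / h     # the difference quotient of A at x
    errors.append(abs(quotient - f(x)))  # distance from f(x) = 4
    print(h, quotient)
```

As $h$ decreases, the printed quotients move toward $f(2) = 4$, in agreement with (5.1).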


5.2 The zero-derivative theorem 

If a function $f$ is constant on an open interval $(a, b)$, its derivative is zero everywhere on
$(a, b)$. We proved this fact earlier as an immediate consequence of the definition of
derivative. We also proved, as part (c) of Theorem 4.7, the converse of this statement
which we restate here as a separate theorem.

THEOREM 5.2. ZERO-DERIVATIVE THEOREM. If $f'(x) = 0$ for each $x$ in an open interval
$I$, then $f$ is constant on $I$.


This theorem, when used in combination with the first fundamental theorem of calculus, 
leads to the second fundamental theorem which is described in the next section. 


5.3 Primitive functions and the second fundamental theorem of calculus 

DEFINITION OF PRIMITIVE FUNCTION. A function $P$ is called a primitive (or an antiderivative)
of a function $f$ on an open interval $I$ if the derivative of $P$ is $f$, that is, if $P'(x) = f(x)$ for all
$x$ in $I$.

For example, the sine function is a primitive of the cosine on every interval because the
derivative of the sine is the cosine. We speak of a primitive, rather than the primitive,
because if $P$ is a primitive of $f$ then so is $P + k$ for every constant $k$. Conversely, any two
primitives $P$ and $Q$ of the same function $f$ can differ only by a constant because their
difference $P - Q$ has the derivative

$$P'(x) - Q'(x) = f(x) - f(x) = 0$$

for every $x$ in $I$ and hence, by Theorem 5.2, $P - Q$ is constant on $I$.

The first fundamental theorem of calculus tells us that we can always construct a primitive 
of a continuous function by integration. When we combine this with the fact that two 
primitives of the same function can differ only by a constant, we obtain the second 
fundamental theorem of calculus. 

THEOREM 5.3. SECOND FUNDAMENTAL THEOREM OF CALCULUS. Assume $f$ is continuous
on an open interval $I$, and let $P$ be any primitive of $f$ on $I$. Then, for each $c$ and each $x$ in $I$,
we have

(5.7)    $$P(x) = P(c) + \int_c^x f(t)\,dt.$$

Proof. Let $A(x) = \int_c^x f(t)\,dt$. Since $f$ is continuous at each $x$ in $I$, the first fundamental
theorem tells us that $A'(x) = f(x)$ for all $x$ in $I$. In other words, $A$ is a primitive of $f$ on $I$.
Since two primitives of $f$ can differ only by a constant, we must have $A(x) - P(x) = k$
for some constant $k$. When $x = c$, this formula implies $-P(c) = k$, since $A(c) = 0$.
Therefore, $A(x) - P(x) = -P(c)$, from which we obtain (5.7).

Theorem 5.3 tells us how to find every primitive $P$ of a continuous function $f$. We simply
integrate $f$ from a fixed point $c$ to an arbitrary point $x$ and add the constant $P(c)$ to get $P(x)$.
But the real power of the theorem becomes apparent when we write Equation (5.7) in the
following form:

(5.8)    $$\int_c^x f(t)\,dt = P(x) - P(c).$$

In this form it tells us that we can compute the value of an integral by a mere subtraction
if we know a primitive $P$. The problem of evaluating an integral is transferred to another
problem, that of finding a primitive $P$ of $f$. In actual practice, the second problem is a
great deal easier to deal with than the first. Every differentiation formula, when read in
reverse, gives us an example of a primitive of some function $f$ and this, in turn, leads to an
integration formula for this function.

From the differentiation formulas worked out thus far we can derive the following 
integration formulas as consequences of the second fundamental theorem. 

EXAMPLE 1. Integration of rational powers. The integration formula

(5.9)    $$\int_a^b x^n\,dx = \frac{b^{n+1} - a^{n+1}}{n + 1} \qquad (n = 0, 1, 2, \ldots)$$

was proved in Section 1.23 directly from the definition of the integral. The result may be
rederived and generalized to rational exponents by using the second fundamental theorem.
First of all, we observe that the function $P$ defined by the equation

(5.10)    $$P(x) = \frac{x^{n+1}}{n + 1}$$

has the derivative $P'(x) = x^n$ if $n$ is any nonnegative integer. Since this is valid for all real
$x$, we may use (5.8) to write

$$\int_a^b x^n\,dx = P(b) - P(a) = \frac{b^{n+1} - a^{n+1}}{n + 1}$$

for all intervals $[a, b]$. This formula, proved for all integers $n \ge 0$, also holds for all negative
integers except $n = -1$, which is excluded because $n + 1$ appears in the denominator. To
prove (5.9) for negative $n$, it suffices to show that (5.10) implies $P'(x) = x^n$ when $n$ is negative
and $\neq -1$, a fact which is easily verified by differentiating $P$ as a rational function. Of
course, when $n$ is negative, neither $P(x)$ nor $P'(x)$ is defined for $x = 0$, and when we use
(5.9) for negative $n$, it is important to exclude those intervals $[a, b]$ that contain the point
$x = 0$.

The results of Example 3 in Section 4.5 enable us to extend (5.9) to all rational exponents
(except $-1$), provided the integrand is defined everywhere on the interval $[a, b]$ under
consideration. For example, if $0 < a < b$ and $n = -\tfrac{1}{2}$, we find

$$\int_a^b \frac{1}{\sqrt{x}}\,dx = \int_a^b x^{-1/2}\,dx = \left.\frac{x^{1/2}}{1/2}\right|_a^b = 2(\sqrt{b} - \sqrt{a}).$$

This result was proved earlier, using the area axioms. The present proof makes no use of
these axioms.

In the next chapter we shall define a general power function $f$ such that $f(x) = x^c$ for
every real exponent $c$. We shall find that this function has the derivative $f'(x) = cx^{c-1}$ and
the primitive $P(x) = x^{c+1}/(c + 1)$ if $c \neq -1$. This will enable us to extend (5.9) to all real
exponents except $-1$.

Note that we cannot get $P'(x) = 1/x$ by differentiation of any function of the form
$P(x) = x^n$. Nevertheless, there exists a function $P$ whose derivative is $P'(x) = 1/x$. To
exhibit such a function all we need to do is write a suitable indefinite integral; for example,

$$P(x) = \int_1^x \frac{1}{t}\,dt \quad \text{if } x > 0.$$

This integral exists because the integrand is monotonic. The function so defined is called
the logarithm (more specifically, the natural logarithm). Its properties are developed
systematically in Chapter 6.
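
The function defined by this indefinite integral can be tabulated numerically without any prior knowledge of logarithms. The sketch below assumes the lower limit of integration is 1 and uses a hypothetical midpoint-rule helper; it compares the values with Python's `math.log` and checks that the slope at $x = 2$ is close to $1/2$, as $P'(x) = 1/x$ requires.

```python
import math

def P(x, n=20000):
    """Midpoint approximation of the integral of 1/t from 1 to x, for x > 0."""
    if x <= 0:
        raise ValueError("P(x) is defined here only for x > 0")
    h = (x - 1.0) / n        # h is negative when 0 < x < 1
    return h * sum(1.0 / (1.0 + (i + 0.5) * h) for i in range(n))

sample_points = (0.5, 1.0, 2.0, 5.0)
values = {x: P(x) for x in sample_points}
for x in sample_points:
    print(x, values[x], math.log(x))

# Symmetric difference quotient of P at x = 2; it should be close to 1/2.
step = 1e-4
slope_at_2 = (P(2.0 + step) - P(2.0 - step)) / (2 * step)
print(slope_at_2)
```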

EXAMPLE 2. Integration of the sine and cosine. Since the derivative of the sine is the
cosine and the derivative of the cosine is minus the sine, the second fundamental theorem
also gives us the following formulas:

$$\int_a^b \cos x\,dx = \sin x \Big|_a^b = \sin b - \sin a,$$

$$\int_a^b \sin x\,dx = (-\cos x) \Big|_a^b = \cos a - \cos b.$$

These formulas were also proved in Chapter 2 directly from the definition of the integral.

Further examples of integration formulas can be obtained from Examples 1 and 2 by
taking finite sums of terms of the form $Ax^n$, $B \sin x$, $C \cos x$, where $A$, $B$, $C$ are constants.
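
The formulas of Examples 1 and 2 can be compared with direct numerical integration. The following sketch is an illustration only: the interval $[1, 2]$, the sample exponents, and the helper names `midpoint_integral` and `power_formula` are assumptions, and the interval is chosen to avoid $x = 0$ so that negative and fractional exponents are allowed.

```python
import math

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def power_formula(n, a, b):
    # Right-hand side of (5.9); requires n != -1
    return (b ** (n + 1) - a ** (n + 1)) / (n + 1)

a, b = 1.0, 2.0
checks = []

# Integer, negative, and rational exponents (all different from -1)
for n in (3, -2, 0.5, -0.5):
    numeric = midpoint_integral(lambda x: x ** n, a, b)
    checks.append(abs(numeric - power_formula(n, a, b)))

# Example 2: integrals of cosine and sine
checks.append(abs(midpoint_integral(math.cos, a, b) - (math.sin(b) - math.sin(a))))
checks.append(abs(midpoint_integral(math.sin, a, b) - (math.cos(a) - math.cos(b))))

worst = max(checks)
print(worst)
```

The largest discrepancy printed should be tiny, reflecting only the error of the numerical rule.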


5.4 Properties of a function deduced from properties of its derivative

If a function $f$ has a continuous derivative $f'$ on an open interval $I$, the second fundamental
theorem states that

(5.11)    $$f(x) = f(c) + \int_c^x f'(t)\,dt$$

for every choice of points $x$ and $c$ in $I$. This formula, which expresses $f$ in terms of its
derivative $f'$, enables us to deduce properties of a function from properties of its derivative.
Although the following properties have already been discussed in Chapter 4, it may be of
interest to see how they can also be deduced as simple consequences of Equation (5.11).

Suppose $f'$ is continuous and nonnegative on $I$. If $x > c$, then $\int_c^x f'(t)\,dt \ge 0$, and hence
$f(x) \ge f(c)$. In other words, if the derivative is continuous and nonnegative on $I$, the
function is increasing on $I$.

In Theorem 2.9 we proved that the indefinite integral of an increasing function is convex.
Therefore, if $f'$ is continuous and increasing on $I$, Equation (5.11) shows that $f$ is convex on
$I$. Similarly, $f$ is concave on those intervals where $f'$ is continuous and decreasing.


5.5 Exercises

In each of Exercises 1 through 10, find a primitive of $f$; that is, find a function $P$ such that
$P'(x) = f(x)$ and use the second fundamental theorem to evaluate $\int_a^b f(x)\,dx$.

1. $f(x) = 5x^3$.

2. $f(x) = 4x^4 - 12x$.

3. $f(x) = (x + 1)(x^3 - 2)$.

4. $f(x) = \dfrac{x^4 + x - 3}{x^3}$,  $x \neq 0$.

5. $f(x) = (1 + \sqrt{x})^2$,  $x > 0$.

6. $f(x) = \sqrt{2x} + \sqrt{\tfrac{1}{2}x}$,  $x > 0$.

7. $f(x) = \dfrac{2x^2 - 6x + 7}{2\sqrt{x}}$,  $x > 0$.

8. $f(x) = 2x^{1/3} - x^{-1/3}$,  $x > 0$.

9. $f(x) = 3 \sin x + 2x^5$.

10. $f(x) = x^{4/3} - 5 \cos x$.

11. Prove that there is no polynomial $f$ whose derivative is given by the formula $f'(x) = 1/x$.

12. Show that $\int_0^x |t|\,dt = \tfrac{1}{2}x|x|$ for all real $x$.

13. Show that

$$\int_0^x (t + |t|)^2\,dt = \frac{2x^2}{3}(x + |x|) \quad \text{for all real } x.$$

14. A function $f$ is continuous everywhere and satisfies the equation

$$\int_0^x f(t)\,dt = -\tfrac{1}{2} + x^2 + x \sin 2x + \tfrac{1}{2} \cos 2x$$

for all $x$. Compute $f(\tfrac{1}{4}\pi)$ and $f'(\tfrac{1}{4}\pi)$.

15. Find a function $f$ and a value of the constant $c$ such that

$$\int_c^x f(t)\,dt = \cos x - \tfrac{1}{2} \quad \text{for all real } x.$$

16. Find a function $f$ and a value of the constant $c$ such that

$$\int_c^x t f(t)\,dt = \sin x - x \cos x - \tfrac{1}{2}x^2 \quad \text{for all real } x.$$


17. There is a function $f$, defined and continuous for all real $x$, which satisfies an equation of the
form

$$\int_0^x f(t)\,dt = \int_x^1 t^2 f(t)\,dt + \frac{x^{16}}{8} + \frac{x^{18}}{9} + c,$$

where $c$ is a constant. Find an explicit formula for $f(x)$ and find the value of the constant $c$.

18. A function $f$ is defined for all real $x$ by the formula

$$f(x) = 3 + \int_0^x \frac{1 + \sin t}{2 + t^2}\,dt.$$

Without attempting to evaluate this integral, find a quadratic polynomial $p(x) = a + bx + cx^2$
such that $p(0) = f(0)$, $p'(0) = f'(0)$, and $p''(0) = f''(0)$.

19. Given a function $g$, continuous everywhere, such that $g(1) = 5$ and $\int_0^1 g(t)\,dt = 2$. Let
$f(x) = \tfrac{1}{2} \int_0^x (x - t)^2 g(t)\,dt$. Prove that

$$f'(x) = x \int_0^x g(t)\,dt - \int_0^x t g(t)\,dt,$$

then compute $f''(1)$ and $f'''(1)$.

20. Without attempting to evaluate the following indefinite integrals, find the derivative $f'(x)$ in
each case if $f(x)$ is equal to

(a) $\int_0^x (1 + t^2)^{-3}\,dt$,    (b) $\int_0^{x^2} (1 + t^2)^{-3}\,dt$,    (c) $\int_{x^3}^{x^2} (1 + t^2)^{-3}\,dt$.

21. Without attempting to evaluate the integral, compute $f'(x)$ if $f$ is defined by the formula

$$f(x) = \int_{x^3}^{x^2} \frac{t^6}{1 + t^4}\,dt.$$

22. In each case, compute $f(2)$ if $f$ is continuous and satisfies the given formula for all $x \ge 0$.

(a) $\int_0^x f(t)\,dt = x^2(1 + x)$.    (c) $\int_0^{f(x)} t^2\,dt = x^2(1 + x)$.

(b) $\int_0^{x^2} f(t)\,dt = x^2(1 + x)$.    (d) $\int_0^{x^2(1+x)} f(t)\,dt = x$.

0 JO 

23. The base of a solid is the ordinate set of a nonnegative function $f$ over the interval $[0, a]$. All
cross sections perpendicular to this interval are squares. The volume of the solid is

$$a^3 - 2a \cos a + (2 - a^2) \sin a$$

for every $a \ge 0$. Assume $f$ is continuous on $[0, a]$ and calculate $f(a)$.

24. A mechanism propels a particle along a straight line. It is designed so that the displacement
of the particle at time $t$ from an initial point 0 on the line is given by the formula $f(t) = \tfrac{1}{2}t^2 +
2t \sin t$. The mechanism works perfectly until time $t = \pi$ when an unexpected malfunction
occurs. From then on the particle moves with constant velocity (the velocity it acquires at
time $t = \pi$). Compute the following: (a) its velocity at time $t = \pi$; (b) its acceleration at
time $t = \tfrac{1}{2}\pi$; (c) its acceleration at time $t = \tfrac{3}{2}\pi$; (d) its displacement from 0 at time $t = \tfrac{5}{2}\pi$.
(e) Find a time $t > \pi$ when the particle returns to the initial point 0, or else prove that it never
returns to 0.

25. A particle moves along a straight line. Its position at time $t$ is $f(t)$. When $0 \le t \le 1$, the
position is given by the integral

$$f(t) = \int_0^t \frac{1 + 2 \sin \pi x \cos \pi x}{1 + x^2}\,dx.$$

(Do not attempt to evaluate this integral.) For $t \ge 1$, the particle moves with constant
acceleration (the acceleration it acquires at time $t = 1$). Compute the following: (a) its
acceleration at time $t = 2$; (b) its velocity when $t = 1$; (c) its velocity when $t > 1$; (d) the difference
$f(t) - f(1)$ when $t > 1$.

26. In each case, find a function $f$ with a continuous second derivative $f''$ which satisfies all the
given conditions or else explain why such an example cannot exist.

(a) $f''(x) > 0$ for every $x$, $f'(0) = 1$, $f'(1) = 0$.

(b) $f''(x) > 0$ for every $x$, $f'(0) = 1$, $f'(1) = 3$.

(c) $f''(x) > 0$ for every $x$, $f'(0) = 1$, $f(x) \le 100$ for all $x > 0$.

(d) $f''(x) > 0$ for every $x$, $f'(0) = 1$, $f(x) \le 100$ for all $x < 0$.

27. A particle moves along a straight line, its position at time $t$ being $f(t)$. It starts with an initial
velocity $f'(0) = 0$ and has a continuous acceleration $f''(t) \ge 6$ for all $t$ in the interval $0 \le t \le 1$.
Prove that the velocity $f'(t) \ge 3$ for all $t$ in some interval $[a, b]$, where $0 \le a < b \le 1$, with
$b - a = \tfrac{1}{2}$.

28. Given a function $f$ such that the integral $A(x) = \int_a^x f(t)\,dt$ exists for each $x$ in an interval $[a, b]$.
Let $c$ be a point in the open interval $(a, b)$. Consider the following ten statements about this
$f$ and this $A$:

(a) $f$ is continuous at $c$.            (α) $A$ is continuous at $c$.
(b) $f$ is discontinuous at $c$.         (β) $A$ is discontinuous at $c$.
(c) $f$ is increasing on $(a, b)$.       (γ) $A$ is convex on $(a, b)$.
(d) $f'(c)$ exists.                      (δ) $A'(c)$ exists.
(e) $f'$ is continuous at $c$.           (ε) $A'$ is continuous at $c$.

In a table like the one shown here, mark T in the appropriate square if the statement labeled
with a Latin letter always implies the statement labeled with a Greek letter. Leave the other
squares blank. For example, if (a) implies (α), mark T in the upper left-hand corner square, etc.

        |  α  |  β  |  γ  |  δ  |  ε
    ----+-----+-----+-----+-----+-----
     a  |     |     |     |     |
     b  |     |     |     |     |
     c  |     |     |     |     |
     d  |     |     |     |     |
     e  |     |     |     |     |


5.6 The Leibniz notation for primitives 

We return now to a further study of the relationship between integration and differentia- 
tion. First we discuss some notation introduced by Leibniz. 

We have defined a primitive $P$ of a function $f$ to be any function for which $P'(x) = f(x)$.
If $f$ is continuous on an interval, one primitive is given by a formula of the form

$$P(x) = \int_c^x f(t)\,dt,$$

and all other primitives can differ from this one only by a constant. Leibniz used the
symbol $\int f(x)\,dx$ to denote a general primitive of $f$. In this notation, an equation like

(5.12)    $$\int f(x)\,dx = P(x) + C$$

is considered to be merely an alternative way of writing $P'(x) = f(x)$. For example, since
the derivative of the sine is the cosine, we may write

(5.13)    $$\int \cos x\,dx = \sin x + C.$$

Similarly, since the derivative of $x^{n+1}/(n + 1)$ is $x^n$, we may write

(5.14)    $$\int x^n\,dx = \frac{x^{n+1}}{n + 1} + C,$$

for any rational power $n \neq -1$. The symbol $C$ represents an arbitrary constant so each
of Equations (5.13) and (5.14) is really a statement about a whole set of functions.

Despite similarity in appearance, the symbol $\int f(x)\,dx$ is conceptually distinct from
the integration symbol $\int_a^b f(x)\,dx$. The symbols originate from two entirely different
processes, differentiation and integration. Since, however, the two processes are related
by the fundamental theorems of calculus, there are corresponding relationships between
the two symbols.

The first fundamental theorem states that any indefinite integral of $f$ is also a primitive
of $f$. Therefore we may replace $P(x)$ in Equation (5.12) by $\int_c^x f(t)\,dt$ for some lower limit
$c$ and write (5.12) as follows:

(5.15)    $$\int f(x)\,dx = \int_c^x f(t)\,dt + C.$$

This means that we can think of the symbol $\int f(x)\,dx$ as representing some indefinite
integral of $f$, plus a constant.

The second fundamental theorem tells us that for any primitive $P$ of $f$ and for any constant
$C$, we have

$$\int_a^b f(x)\,dx = [P(x) + C] \Big|_a^b.$$

If we replace $P(x) + C$ by $\int f(x)\,dx$, this formula may be written in the form

(5.16)    $$\int_a^b f(x)\,dx = \int f(x)\,dx \,\Big|_a^b.$$

The two formulas in (5.15) and (5.16) may be thought of as symbolic expressions of the
first and second fundamental theorems of calculus.

Because of long historical usage, many calculus textbooks refer to the symbol $\int f(x)\,dx$
as an “indefinite integral” rather than as a primitive or an antiderivative. This is justified,
in part, by Equation (5.15), which tells us that the symbol $\int f(x)\,dx$ is, apart from an
additive constant $C$, an indefinite integral of $f$. For the same reason, many handbooks of
mathematical tables contain extensive lists of formulas labeled “tables of indefinite
integrals” which, in reality, are tables of primitives. To distinguish the symbol $\int f(x)\,dx$
from $\int_a^b f(x)\,dx$, the latter is called a definite integral. Since the second fundamental theorem
reduces the problem of integration to that of finding a primitive, the term “technique of
integration” is used to refer to any systematic method for finding primitives. This
terminology is widely used in the mathematical literature, and it will be adopted also in this
book. Thus, for example, when one is asked to “integrate” $\int f(x)\,dx$, it is to be understood
that what is wanted is the most general primitive of $f$.

There are three principal techniques that are used to construct tables of indefinite
integrals, and they should be learned by anyone who desires a good working knowledge
of calculus. They are (1) integration by substitution (to be described in the next section),
a method based on the chain rule; (2) integration by parts, a method based on the formula
for differentiating a product (to be described in Section 5.9); and (3) integration by partial
fractions, an algebraic technique which is discussed at the end of Chapter 6. These
techniques not only explain how tables of indefinite integrals are constructed, but also
they tell us how certain formulas are converted to the basic forms listed in the tables.


5.7 Integration by substitution

Let $Q$ be a composition of two functions $P$ and $g$, say $Q(x) = P[g(x)]$ for all $x$ in some
interval $I$. If we know the derivative of $P$, say $P'(x) = f(x)$, the chain rule tells us that the
derivative of $Q$ is given by the formula $Q'(x) = P'[g(x)]g'(x)$. Since $P' = f$, this states
that $Q'(x) = f[g(x)]g'(x)$. In other words,

(5.17)    $$P'(x) = f(x) \quad \text{implies} \quad Q'(x) = f[g(x)]g'(x).$$

In Leibniz notation, this statement can be written as follows: If we have the integration
formula

(5.18)    $$\int f(x)\,dx = P(x) + C,$$

then we also have the more general formula

(5.19)    $$\int f[g(x)]g'(x)\,dx = P[g(x)] + C.$$

For example, if $f(x) = \cos x$, then (5.18) holds with $P(x) = \sin x$, so (5.19) becomes

(5.20)    $$\int \cos g(x) \cdot g'(x)\,dx = \sin g(x) + C.$$

In particular, if $g(x) = x^3$, this gives us

$$\int \cos x^3 \cdot 3x^2\,dx = \sin x^3 + C,$$

a result that is easily verified directly since the derivative of $\sin x^3$ is $3x^2 \cos x^3$.

Now we notice that the general formula in (5.19) is related to (5.18) by a simple mechanical
process. Suppose we replace $g(x)$ everywhere in (5.19) by a new symbol $u$ and replace $g'(x)$
by $du/dx$, the Leibniz notation for derivatives. Then (5.19) becomes

$$\int f(u) \frac{du}{dx}\,dx = P(u) + C.$$

At this stage the temptation is strong to replace the combination $\dfrac{du}{dx}\,dx$ by $du$. If we do
this, the last formula becomes

(5.21)    $$\int f(u)\,du = P(u) + C.$$

Notice that this has exactly the same form as (5.18), except that the symbol $u$ appears
everywhere instead of $x$. In other words, every integration formula such as (5.18) can be
made to yield a more general integration formula if we simply substitute symbols. We
replace $x$ in (5.18) by a new symbol $u$ to obtain (5.21), and then we think of $u$ as representing
a new function of $x$, say $u = g(x)$. Then we replace the symbol $du$ by the combination
$g'(x)\,dx$, and Equation (5.21) reduces to the general formula in (5.19).

For example, if we replace $x$ by $u$ in the formula $\int \cos x\,dx = \sin x + C$, we obtain

$$\int \cos u\,du = \sin u + C.$$

In this latter formula, $u$ may be replaced by $g(x)$ and $du$ by $g'(x)\,dx$, and a correct integration
formula, (5.20), results.

When this mechanical process is used in reverse, it becomes the method of integration by
substitution. The object of the method is to transform an integral with a complicated
integrand, such as $\int 3x^2 \cos x^3\,dx$, into a more familiar integral, such as $\int \cos u\,du$. The
method is applicable whenever the original integral can be written in the form

$$\int f[g(x)]g'(x)\,dx,$$

since the substitution

$$u = g(x), \qquad du = g'(x)\,dx,$$

transforms this to $\int f(u)\,du$. If we succeed in carrying out the integration indicated by
$\int f(u)\,du$, we obtain a primitive, say $P(u)$, and then the original integral may be evaluated
by replacing $u$ by $g(x)$ in the formula for $P(u)$.

The reader should realize that we have attached no meanings to the symbols dx and du 
by themselves. They are used as purely formal devices to help us perform the mathematical 
operations in a mechanical way. Each time we use the process, we are really applying the 
statement (5.17). 

Success in this method depends on one’s ability to determine at the outset which part of
the integrand should be replaced by the symbol $u$, and this ability comes from a lot of
experience in working out specific examples. The following examples illustrate how the
method is carried out in actual practice.

EXAMPLE 1. Integrate $\int x^3 \cos x^4\,dx$.

Solution. Let us keep in mind that we are trying to write $x^3 \cos x^4$ in the form $f[g(x)]g'(x)$
with a suitable choice of $f$ and $g$. Since $\cos x^4$ is a composition, this suggests that we take
$f(x) = \cos x$ and $g(x) = x^4$ so that $\cos x^4$ becomes $f[g(x)]$. This choice of $g$ gives $g'(x) =
4x^3$, and hence $f[g(x)]g'(x) = (\cos x^4)(4x^3)$. The extra factor 4 is easily taken care of
by multiplying and dividing the integrand by 4. Thus we have

$$x^3 \cos x^4 = \tfrac{1}{4}(\cos x^4)(4x^3) = \tfrac{1}{4} f[g(x)]g'(x).$$

Now, we make the substitution $u = g(x) = x^4$, $du = g'(x)\,dx = 4x^3\,dx$, and obtain

$$\int x^3 \cos x^4\,dx = \tfrac{1}{4} \int f(u)\,du = \tfrac{1}{4} \int \cos u\,du = \tfrac{1}{4} \sin u + C.$$

Replacing $u$ by $x^4$ in the end result, we obtain the formula

$$\int x^3 \cos x^4\,dx = \tfrac{1}{4} \sin x^4 + C,$$

which can be verified directly by differentiation.

After a little practice one can perform some of the above steps mentally, and the entire
calculation can be given more briefly as follows: Let $u = x^4$. Then $du = 4x^3\,dx$, and we
obtain

$$\int x^3 \cos x^4\,dx = \tfrac{1}{4} \int (\cos x^4)(4x^3\,dx) = \tfrac{1}{4} \int \cos u\,du = \tfrac{1}{4} \sin u + C = \tfrac{1}{4} \sin x^4 + C.$$

Notice that the method works in this example because the factor $x^3$ has an exponent one
less than the power of $x$ which appears in $\cos x^4$.

EXAMPLE 2. Integrate $\int \cos^2 x \sin x\,dx$.

Solution. Let $u = \cos x$. Then $du = -\sin x\,dx$, and we get

$$\int \cos^2 x \sin x\,dx = -\int (\cos x)^2(-\sin x\,dx) = -\int u^2\,du = -\frac{u^3}{3} + C = -\frac{\cos^3 x}{3} + C.$$

Again, the final result is easily verified by differentiation.
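
EXAMPLE 3. Integrate $\displaystyle\int \frac{\sin \sqrt{x}}{\sqrt{x}}\,dx$.

Solution. Let $u = \sqrt{x} = x^{1/2}$. Then $du = \tfrac{1}{2}x^{-1/2}\,dx$, or $dx/\sqrt{x} = 2\,du$. Hence

$$\int \frac{\sin \sqrt{x}}{\sqrt{x}}\,dx = 2 \int \sin u\,du = -2 \cos u + C = -2 \cos \sqrt{x} + C.$$

EXAMPLE 4. Integrate $\displaystyle\int \frac{x\,dx}{\sqrt{1 + x^2}}$.

Solution. Let $u = 1 + x^2$. Then $du = 2x\,dx$ so $x\,dx = \tfrac{1}{2}\,du$, and we obtain

$$\int \frac{x\,dx}{\sqrt{1 + x^2}} = \frac{1}{2} \int \frac{du}{\sqrt{u}} = \frac{1}{2} \int u^{-1/2}\,du = u^{1/2} + C = \sqrt{1 + x^2} + C.$$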
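
Each result obtained by substitution can be verified by differentiating the proposed primitive and comparing with the integrand. The sketch below does this numerically for Examples 1 through 4; the helper name `derivative`, the step size, and the sample points are assumptions made for the illustration, and the points are taken positive so that $\sqrt{x}$ is defined.

```python
import math

def derivative(F, x, h=1e-6):
    # Symmetric difference quotient approximating F'(x)
    return (F(x + h) - F(x - h)) / (2 * h)

# Pairs (integrand f, primitive P found by substitution), with C = 0
examples = [
    (lambda x: x**3 * math.cos(x**4),                 lambda x: math.sin(x**4) / 4),
    (lambda x: math.cos(x)**2 * math.sin(x),          lambda x: -math.cos(x)**3 / 3),
    (lambda x: math.sin(math.sqrt(x)) / math.sqrt(x), lambda x: -2 * math.cos(math.sqrt(x))),
    (lambda x: x / math.sqrt(1 + x**2),               lambda x: math.sqrt(1 + x**2)),
]

points = (0.5, 1.0, 1.5)
worst = max(abs(derivative(P, x) - f(x)) for f, P in examples for x in points)
print(worst)   # small if every P'(x) agrees with f(x)
```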


The method of substitution is, of course, also applicable to definite integrals. For example,
to evaluate the definite integral $\int_0^{\pi/2} \cos^2 x \sin x\,dx$, we first determine the indefinite integral,
as explained in Example 2, and then we use the second fundamental theorem to write

$$\int_0^{\pi/2} \cos^2 x \sin x\,dx = -\frac{1}{3} \cos^3 x \,\Big|_0^{\pi/2} = -\frac{1}{3}\left(\cos^3 \frac{\pi}{2} - \cos^3 0\right) = \frac{1}{3}.$$

Sometimes it is desirable to apply the second fundamental theorem directly to the integral 
expressed in terms of u. This may be done by using new limits of integration. We shall 
illustrate how this is carried out in a particular example, and then we shall justify the 
process with a general theorem. 


EXAMPLE 5. Evaluate $\int_2^3 \frac{(x + 1)\,dx}{\sqrt{x^2 + 2x + 3}}$.

Solution. Let $u = x^2 + 2x + 3$. Then $du = (2x + 2)\,dx$, so that

$\frac{(x + 1)\,dx}{\sqrt{x^2 + 2x + 3}} = \frac{1}{2}\frac{du}{\sqrt{u}}$.

Now we obtain new limits of integration by noting that $u = 11$ when $x = 2$, and that
$u = 18$ when $x = 3$. Then we write

$\int_2^3 \frac{(x + 1)\,dx}{\sqrt{x^2 + 2x + 3}} = \frac{1}{2}\int_{11}^{18} u^{-1/2}\,du = \sqrt{u}\,\Big|_{11}^{18} = \sqrt{18} - \sqrt{11}$.

The same result is arrived at by expressing everything in terms of $x$. Thus we have

$\int_2^3 \frac{(x + 1)\,dx}{\sqrt{x^2 + 2x + 3}} = \sqrt{x^2 + 2x + 3}\,\Big|_2^3 = \sqrt{18} - \sqrt{11}$.
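The change of limits in Example 5 can be checked numerically. The sketch below approximates the integral once in the variable $x$ on $[2, 3]$ and once in the variable $u$ on $[11, 18]$, and compares both with $\sqrt{18} - \sqrt{11}$. The `simpson` helper and its step count are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Original integral in x, with limits 2 and 3.
in_x = simpson(lambda x: (x + 1) / math.sqrt(x * x + 2 * x + 3), 2.0, 3.0)

# Transformed integral in u = x^2 + 2x + 3, with the new limits 11 and 18.
in_u = simpson(lambda u: 0.5 / math.sqrt(u), 11.0, 18.0)

exact = math.sqrt(18) - math.sqrt(11)
print(in_x, in_u, exact)
```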

Now we prove a general theorem which justifies the process used in Example 5. 


THEOREM 5.4. SUBSTITUTION THEOREM FOR INTEGRALS. Assume $g$ has a continuous
derivative $g'$ on an open interval $I$. Let $J$ be the set of values taken by $g$ on $I$ and assume that
$f$ is continuous on $J$. Then for each $x$ and $c$ in $I$, we have

(5.22)    $\int_c^x f[g(t)]\,g'(t)\,dt = \int_{g(c)}^{g(x)} f(u)\,du$.

Proof. Let $a = g(c)$ and define two new functions $P$ and $Q$ as follows:

$P(x) = \int_a^x f(u)\,du$ if $x \in J$,    $Q(x) = \int_c^x f[g(t)]\,g'(t)\,dt$ if $x \in I$.

Since $P$ and $Q$ are indefinite integrals of continuous functions, they have derivatives given
by the formulas

$P'(x) = f(x)$,    $Q'(x) = f[g(x)]\,g'(x)$.

Now let $R$ denote the composite function, $R(x) = P[g(x)]$. Using the chain rule, we find

$R'(x) = P'[g(x)]\,g'(x) = f[g(x)]\,g'(x) = Q'(x)$.

Applying the second fundamental theorem twice, we obtain

$\int_{g(c)}^{g(x)} f(u)\,du = \int_{g(c)}^{g(x)} P'(u)\,du = P[g(x)] - P[g(c)] = R(x) - R(c)$,

and

$\int_c^x f[g(t)]\,g'(t)\,dt = \int_c^x Q'(t)\,dt = \int_c^x R'(t)\,dt = R(x) - R(c)$.

This shows that the two integrals in (5.22) are equal.


5.8 Exercises 

In Exercises 1 through 20, evaluate the integrals by the method of substitution.

1. $\int \sqrt{2x + 1}\,dx$.
2. $\int x\sqrt{1 + 3x}\,dx$.
3. $\int x^2\sqrt{x + 1}\,dx$.
4. $\int_{-2/3}^{1/3} \frac{x\,dx}{\sqrt{2 - 3x}}$.
5. $\int \frac{(x + 1)\,dx}{(x^2 + 2x + 2)^3}$.
6. $\int \sin^3 x\,dx$.
7. $\int z(z - 1)^{1/3}\,dz$.
8. $\int \frac{\cos x\,dx}{\sin^3 x}$.
9. $\int_0^{\pi/4} \cos 2x\,\sqrt{4 - \sin 2x}\,dx$.
10. $\int \frac{\sin x\,dx}{(3 + \cos x)^2}$.
11. $\int \frac{\sin x\,dx}{\sqrt{\cos^3 x}}$.
12. $\int_3^8 \frac{\sin \sqrt{x + 1}\,dx}{\sqrt{x + 1}}$.
13. $\int x^{n-1} \sin x^n\,dx$,  $n \neq 0$.
14. $\int \frac{x^5\,dx}{\sqrt{1 - x^6}}$.
15. $\int t(1 + t)^{1/4}\,dt$.
16. $\int (x^2 + 1)^{-3/2}\,dx$.
17. $\int x^2(8x^3 + 27)^{2/3}\,dx$.
18. $\int \frac{(\sin x + \cos x)\,dx}{(\sin x - \cos x)^{1/3}}$.
19. $\int \frac{x\,dx}{\sqrt{1 + x^2} + \sqrt{(1 + x^2)^3}}$.
20. $\int \frac{(x^2 + 1 - 2x)^{1/5}\,dx}{1 - x}$.
21. Deduce the formulas in Theorems 1.18 and 1.19 by the method of substitution.

22. Let

$F(x, a) = \int_0^x \frac{t^p}{(t^2 + a^2)^q}\,dt$,

where $a > 0$, and $p$ and $q$ are positive integers. Show that $F(x, a) = a^{p+1-2q} F(x/a, 1)$.

23. Show that

$\int_x^1 \frac{dt}{1 + t^2} = \int_1^{1/x} \frac{dt}{1 + t^2}$  if $x > 0$.

24. If $m$ and $n$ are positive integers, show that

$\int_0^1 x^m(1 - x)^n\,dx = \int_0^1 x^n(1 - x)^m\,dx$.

25. If $m$ is a positive integer, show that

$\int_0^{\pi/2} \cos^m x \sin^m x\,dx = 2^{-m}\int_0^{\pi/2} \cos^m x\,dx$.

26. (a) Show that

$\int_0^{\pi} x f(\sin x)\,dx = \frac{\pi}{2}\int_0^{\pi} f(\sin x)\,dx$.  [Hint: $u = \pi - x$.]

(b) Use part (a) to deduce the formula

$\int_0^{\pi} \frac{x \sin x}{1 + \cos^2 x}\,dx = \pi \int_0^1 \frac{dx}{1 + x^2}$.

27. Show that $\int_0^1 (1 - x^2)^{n-1/2}\,dx = \int_0^{\pi/2} \cos^{2n} u\,du$ if $n$ is a positive integer. [Hint: $x = \sin u$.]
The integral on the right can be evaluated by the method of integration by parts, to be discussed
in the next section.


5.9 Integration by parts 

We proved in Chapter 4 that the derivative of a product of two functions $f$ and $g$ is given
by the formula

$h'(x) = f(x)g'(x) + f'(x)g(x)$,

where $h(x) = f(x)g(x)$. When this is translated into the Leibniz notation for primitives, it
becomes $\int f(x)g'(x)\,dx + \int f'(x)g(x)\,dx = f(x)g(x) + C$, usually written as follows:

(5.23)    $\int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx + C$.

This equation, known as the formula for integration by parts, provides us with a new
integration technique.

To evaluate an integral, say $\int k(x)\,dx$, using (5.23), we try to find two functions $f$ and $g$
such that $k(x)$ can be written in the form $f(x)g'(x)$. If we can do this, then (5.23) tells us
that we have

$\int k(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx + C$,

and the difficulty has been transferred to the evaluation of $\int g(x)f'(x)\,dx$. If $f$ and $g$ are
properly chosen, this last integral may be easier to evaluate than the original one. Sometimes
two or more applications of (5.23) will lead to an integral that is easily evaluated
or that may be found in a table. The examples worked out below have been chosen to
illustrate the advantages of this method. For definite integrals, (5.23) leads to the formula

$\int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f'(x)g(x)\,dx$.

If we introduce the substitutions $u = f(x)$, $v = g(x)$, $du = f'(x)\,dx$, and $dv = g'(x)\,dx$,
the formula for integration by parts assumes an abbreviated form that many people find
easier to remember, namely

(5.24)    $\int u\,dv = uv - \int v\,du + C$.

EXAMPLE 1. Integrate $\int x \cos x\,dx$.

Solution. We choose $f(x) = x$ and $g'(x) = \cos x$. This means that we have $f'(x) = 1$
and $g(x) = \sin x$, so (5.23) becomes

(5.25)    $\int x \cos x\,dx = x \sin x - \int \sin x\,dx + C = x \sin x + \cos x + C$.

Note that in this case the second integral is one we have already calculated.

To carry out the same calculation in the abbreviated notation of (5.24) we write

$u = x$,    $dv = \cos x\,dx$,

$du = dx$,    $v = \int \cos x\,dx = \sin x$,

$\int x \cos x\,dx = uv - \int v\,du = x \sin x - \int \sin x\,dx + C = x \sin x + \cos x + C$.

Had we chosen $u = \cos x$ and $dv = x\,dx$, we would have obtained $du = -\sin x\,dx$,
$v = \tfrac{1}{2}x^2$, and (5.24) would have given us

$\int x \cos x\,dx = \tfrac{1}{2}x^2 \cos x - \tfrac{1}{2}\int x^2(-\sin x)\,dx + C$.

Since the last integral is one which we have not yet calculated, this choice of $u$ and $v$ is not
as useful as the first choice. Notice, however, that we can solve this last equation for
$\int x^2 \sin x\,dx$ and use (5.25) to obtain

$\int x^2 \sin x\,dx = 2x \sin x + 2\cos x - x^2 \cos x + C$.
EXAMPLE 2. Integrate $\int x^2 \cos x\,dx$.

Solution. Let $u = x^2$ and $dv = \cos x\,dx$. Then $du = 2x\,dx$ and $v = \int \cos x\,dx = \sin x$,
so we have

(5.26)    $\int x^2 \cos x\,dx = \int u\,dv = uv - \int v\,du + C = x^2 \sin x - 2\int x \sin x\,dx + C$.

The last integral can be evaluated by applying integration by parts once more. Since it is
similar to Example 1, we simply state the result:

$\int x \sin x\,dx = -x \cos x + \sin x + C$.

Substituting in (5.26) and consolidating the two arbitrary constants into one, we obtain

$\int x^2 \cos x\,dx = x^2 \sin x + 2x \cos x - 2\sin x + C$.
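A result obtained by two rounds of integration by parts is easy to get wrong by a sign, so a numerical cross-check is worthwhile. The sketch below compares a Simpson's-rule value of $\int_0^2 x^2 \cos x\,dx$ with the difference of the primitive $x^2 \sin x + 2x \cos x - 2\sin x$ at the endpoints. The interval $[0, 2]$ and the `simpson` helper are assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Primitive of x^2 cos x from Example 2 (taking C = 0).
def primitive(x):
    return x * x * math.sin(x) + 2 * x * math.cos(x) - 2 * math.sin(x)

numeric = simpson(lambda x: x * x * math.cos(x), 0.0, 2.0)
exact = primitive(2.0) - primitive(0.0)
print(numeric, exact)
```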

EXAMPLE 3. The method sometimes fails because it leads back to the original integral.
For example, let us try to integrate $\int x^{-1}\,dx$ by parts. If we let $u = x$ and $dv = x^{-2}\,dx$,
then $\int x^{-1}\,dx = \int u\,dv$. For this choice of $u$ and $v$, we have $du = dx$ and $v = -x^{-1}$, so
(5.24) gives us

(5.27)    $\int x^{-1}\,dx = \int u\,dv = uv - \int v\,du + C = -1 + \int x^{-1}\,dx + C$,

and we are back where we started. Moreover, the situation does not improve if we try
$u = x^n$ and $dv = x^{-n-1}\,dx$.

This example is often used to illustrate the importance of paying attention to the arbitrary
constant $C$. If formula (5.27) is written without $C$, it leads to the equation $\int x^{-1}\,dx =
-1 + \int x^{-1}\,dx$, which is sometimes used to give a fallacious proof that $0 = -1$.

As an application of the method of integration by parts, we obtain another version of 
the weighted mean-value theorem for integrals (Theorem 3.16). 


THEOREM 5.5. SECOND MEAN-VALUE THEOREM FOR INTEGRALS. Assume $g$ is continuous on
$[a, b]$, and assume $f$ has a derivative which is continuous and never changes sign in $[a, b]$.
Then, for some $c$ in $[a, b]$, we have

(5.28)    $\int_a^b f(x)g(x)\,dx = f(a)\int_a^c g(x)\,dx + f(b)\int_c^b g(x)\,dx$.
Proof. Let $G(x) = \int_a^x g(t)\,dt$. Since $g$ is continuous, we have $G'(x) = g(x)$. Therefore,
integration by parts gives us

(5.29)    $\int_a^b f(x)g(x)\,dx = \int_a^b f(x)G'(x)\,dx = f(b)G(b) - \int_a^b f'(x)G(x)\,dx$,

since $G(a) = 0$. By the weighted mean-value theorem, we have

$\int_a^b f'(x)G(x)\,dx = G(c)\int_a^b f'(x)\,dx = G(c)[f(b) - f(a)]$

for some $c$ in $[a, b]$. Therefore, (5.29) becomes

$\int_a^b f(x)g(x)\,dx = f(b)G(b) - G(c)[f(b) - f(a)] = f(a)G(c) + f(b)[G(b) - G(c)]$.

This proves (5.28) since $G(c) = \int_a^c g(x)\,dx$ and $G(b) - G(c) = \int_c^b g(x)\,dx$.
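To make the statement of Theorem 5.5 concrete, the following sketch locates a point $c$ for one particular pair of functions. The choices $f(x) = x$, $g(x) = \cos x$, and $[a, b] = [0, 2]$ are assumptions made only for illustration; the primitives $\sin x$ and $x \sin x + \cos x$ are used in closed form, and $c$ is found by bisection on the difference between the two sides of (5.28).

```python
import math

a, b = 0.0, 2.0
f = lambda x: x                            # f'(x) = 1 is continuous and never changes sign
G = lambda x: math.sin(x) - math.sin(a)    # G(x) = integral of cos t from a to x

# Integral of f(x) g(x) = x cos x over [a, b], via the primitive x sin x + cos x.
P = lambda x: x * math.sin(x) + math.cos(x)
integral = P(b) - P(a)

def gap(c):
    """Right-hand side of (5.28) minus the left-hand side, as a function of c."""
    return f(a) * G(c) + f(b) * (G(b) - G(c)) - integral

# gap changes sign on [a, b], so bisection converges to some admissible c.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if gap(lo) * gap(mid) <= 0:
        hi = mid
    else:
        lo = mid
c = (lo + hi) / 2
print(c, gap(c))
```

The theorem only asserts that some such $c$ exists; the bisection returns one of them and says nothing about uniqueness.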


5.10 Exercises 

Use integration by parts to evaluate the integrals in Exercises 1 through 6.

1. $\int x \sin x\,dx$.
2. $\int x^2 \sin x\,dx$.
3. $\int x^3 \cos x\,dx$.
4. $\int x^3 \sin x\,dx$.
5. $\int \sin x \cos x\,dx$.
6. $\int x \sin x \cos x\,dx$.

7. Use integration by parts to deduce the formula

$\int \sin^2 x\,dx = -\sin x \cos x + \int \cos^2 x\,dx$.

In the second integral, write $\cos^2 x = 1 - \sin^2 x$ and thereby deduce the formula

$\int \sin^2 x\,dx = \tfrac{1}{2}x - \tfrac{1}{4}\sin 2x$.

8. Use integration by parts to deduce the formula

$\int \sin^n x\,dx = -\sin^{n-1} x \cos x + (n - 1)\int \sin^{n-2} x \cos^2 x\,dx$.

In the second integral, write $\cos^2 x = 1 - \sin^2 x$ and thereby deduce the recursion formula

$\int \sin^n x\,dx = -\frac{\sin^{n-1} x \cos x}{n} + \frac{n - 1}{n}\int \sin^{n-2} x\,dx$.

9. Use the results of Exercises 7 and 8 to show that

(a) $\int_0^{\pi/2} \sin^2 x\,dx = \frac{\pi}{4}$.

(b) $\int_0^{\pi/2} \sin^4 x\,dx = \frac{3}{4}\int_0^{\pi/2} \sin^2 x\,dx = \frac{3\pi}{16}$.

(c) $\int_0^{\pi/2} \sin^6 x\,dx = \frac{5}{6}\int_0^{\pi/2} \sin^4 x\,dx = \frac{5\pi}{32}$.

10. Use the results of Exercises 7 and 8 to derive the following formulas.

(a) $\int \sin^3 x\,dx = -\tfrac{3}{4}\cos x + \tfrac{1}{12}\cos 3x$.

(b) $\int \sin^4 x\,dx = \tfrac{3}{8}x - \tfrac{1}{4}\sin 2x + \tfrac{1}{32}\sin 4x$.

(c) $\int \sin^5 x\,dx = -\tfrac{5}{8}\cos x + \tfrac{5}{48}\cos 3x - \tfrac{1}{80}\cos 5x$.

11. Use integration by parts and the results of Exercises 7 and 10 to deduce the following formulas.

(a) $\int x \sin^2 x\,dx = \tfrac{1}{4}x^2 - \tfrac{1}{4}x \sin 2x - \tfrac{1}{8}\cos 2x$.

(b) $\int x \sin^3 x\,dx = \tfrac{3}{4}\sin x - \tfrac{1}{36}\sin 3x - \tfrac{3}{4}x \cos x + \tfrac{1}{12}x \cos 3x$.

(c) $\int x^2 \sin^2 x\,dx = \tfrac{1}{6}x^3 + (\tfrac{1}{8} - \tfrac{1}{4}x^2)\sin 2x - \tfrac{1}{4}x \cos 2x$.

12. Use integration by parts to derive the recursion formula

$\int \cos^n x\,dx = \frac{\cos^{n-1} x \sin x}{n} + \frac{n - 1}{n}\int \cos^{n-2} x\,dx$.

13. Use the result of Exercise 12 to obtain the following formulas.

(a) $\int \cos^2 x\,dx = \tfrac{1}{2}x + \tfrac{1}{4}\sin 2x$.

(b) $\int \cos^3 x\,dx = \tfrac{3}{4}\sin x + \tfrac{1}{12}\sin 3x$.

(c) $\int \cos^4 x\,dx = \tfrac{3}{8}x + \tfrac{1}{4}\sin 2x + \tfrac{1}{32}\sin 4x$.

14. Use integration by parts to show that

$\int \sqrt{1 - x^2}\,dx = x\sqrt{1 - x^2} + \int \frac{x^2}{\sqrt{1 - x^2}}\,dx$.

Write $x^2 = x^2 - 1 + 1$ in the second integral and deduce the formula

$\int \sqrt{1 - x^2}\,dx = \tfrac{1}{2}x\sqrt{1 - x^2} + \tfrac{1}{2}\int \frac{dx}{\sqrt{1 - x^2}}$.

15. (a) Use integration by parts to derive the formula

$\int (a^2 - x^2)^n\,dx = \frac{x(a^2 - x^2)^n}{2n + 1} + \frac{2a^2 n}{2n + 1}\int (a^2 - x^2)^{n-1}\,dx + C$.

(b) Use part (a) to evaluate $\int_0^a (a^2 - x^2)^{5/2}\,dx$.

16. (a) If $I_n(x) = \int_0^x t^n(t^2 + a^2)^{-1/2}\,dt$, use integration by parts to show that

$n I_n(x) = x^{n-1}\sqrt{x^2 + a^2} - (n - 1)a^2 I_{n-2}(x)$  if $n \geq 2$.

(b) Use part (a) to show that $\int_0^2 x^5(x^2 + 5)^{-1/2}\,dx = 168/5 - 40\sqrt{5}/3$.

17. Evaluate the integral $\int_{-1}^3 t^3(4 + t^3)^{-1/2}\,dt$, given that $\int_{-1}^3 (4 + t^3)^{1/2}\,dt = 11.35$. Leave the
answer in terms of $\sqrt{3}$ and $\sqrt{31}$.

18. Use integration by parts to derive the formula

$\int \frac{\sin^{n+1} x}{\cos^{m+1} x}\,dx = \frac{1}{m}\frac{\sin^n x}{\cos^m x} - \frac{n}{m}\int \frac{\sin^{n-1} x}{\cos^{m-1} x}\,dx$.

Apply the formula to integrate $\int \tan^2 x\,dx$ and $\int \tan^4 x\,dx$.

19. Use integration by parts to derive the formula

$\int \frac{\cos^{m+1} x}{\sin^{n+1} x}\,dx = -\frac{1}{n}\frac{\cos^m x}{\sin^n x} - \frac{m}{n}\int \frac{\cos^{m-1} x}{\sin^{n-1} x}\,dx$.

Apply the formula to integrate $\int \cot^2 x\,dx$ and $\int \cot^4 x\,dx$.

20. (a) Find an integer $n$ such that $n\int_0^1 x f''(2x)\,dx = \int_0^2 t f''(t)\,dt$.

(b) Compute $\int_0^1 x f''(2x)\,dx$, given that $f(0) = 1$, $f(2) = 3$, and $f'(2) = 5$.

21. (a) If $\phi''$ is continuous and nonzero on $[a, b]$, and if there is a constant $m > 0$ such that
$\phi'(t) \geq m$ for all $t$ in $[a, b]$, use Theorem 5.5 to prove that

$\left|\int_a^b \sin \phi(t)\,dt\right| \leq \frac{4}{m}$.

[Hint: Multiply and divide the integrand by $\phi'(t)$.]
(b) If $a > 0$, show that $\left|\int_a^x \sin(t^2)\,dt\right| \leq 2/a$ for all $x > a$.


* 5.11 Miscellaneous review exercises 

1. Let $f$ be a polynomial with $f(0) = 1$ and let $g(x) = x^n f(x)$. Compute $g(0), g'(0), \ldots, g^{(n)}(0)$.

2. Find a polynomial $P$ of degree $\leq 5$ with $P(0) = 1$, $P(1) = 2$, $P'(0) = P''(0) = P'(1) = P''(1) = 0$.

3. If $f(x) = \cos x$ and $g(x) = \sin x$, prove that

$f^{(n)}(x) = \cos(x + \tfrac{1}{2}n\pi)$  and  $g^{(n)}(x) = \sin(x + \tfrac{1}{2}n\pi)$.

4. If $h(x) = f(x)g(x)$, prove that the $n$th derivative of $h$ is given by the formula

$h^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x)\,g^{(n-k)}(x)$,

where $\binom{n}{k}$ denotes the binomial coefficient. This is called Leibniz's formula.

5. Given two functions $f$ and $g$ whose derivatives $f'$ and $g'$ satisfy the equations

(5.30)    $f'(x) = g(x)$,  $g'(x) = -f(x)$,  $f(0) = 0$,  $g(0) = 1$,

for every $x$ in some open interval $J$ containing 0. For example, these equations are satisfied
when $f(x) = \sin x$ and $g(x) = \cos x$.






(a) Prove that $f^2(x) + g^2(x) = 1$ for every $x$ in $J$.

(b) Let $F$ and $G$ be another pair of functions satisfying (5.30). Prove that $F(x) = f(x)$ and
$G(x) = g(x)$ for every $x$ in $J$. [Hint: Consider $h(x) = [F(x) - f(x)]^2 + [G(x) - g(x)]^2$.]

(c) What more can you say about functions $f$ and $g$ satisfying (5.30)?

6. A function $f$, defined for all positive real numbers, satisfies the equation $f(x^2) = x^3$ for every
$x > 0$. Determine $f'(4)$.

7. A function $g$, defined for all positive real numbers, satisfies the following two conditions:
$g(1) = 1$ and $g'(x^2) = x^3$ for all $x > 0$. Compute $g(4)$.

8. Show that

$\int_0^x \frac{\sin t}{t + 1}\,dt > 0$  for all $x > 0$.

9. Let $C_1$ and $C_2$ be two curves passing through the origin as indicated in Figure 5.2. A curve
$C$ is said to "bisect in area" the region between $C_1$ and $C_2$ if, for each point $P$ of $C$, the two
shaded regions $A$ and $B$ shown in the figure have equal areas. Determine the upper curve $C_2$,
given that the bisecting curve $C$ has the equation $y = x^2$ and that the lower curve $C_1$ has the
equation $y = \tfrac{1}{2}x^2$.

Figure 5.2 Exercise 9.

10. A function $f$ is defined for all $x$ as follows:

$f(x) = x^2$ if $x$ is rational,    $f(x) = 0$ if $x$ is irrational.

Let $Q(h) = f(h)/h$ if $h \neq 0$. (a) Prove that $Q(h) \to 0$ as $h \to 0$. (b) Prove that $f$ has a derivative
at 0, and compute $f'(0)$.

In Exercises 11 through 20, evaluate the given integrals. Try to simplify the calculations by 
using the method of substitution and/or integration by parts whenever possible. 


. [ (2 + 3x) sin 5x dx. 
. jxV 1 + x 2 dx. 

13. £*(* 2 -1 fdx. 


11 

12 . 


16 

17 


14. 


1 2x + 3 


dx. 


0 (6x + 7 ) 3 
15. [x 4 (l + x>f dx 


. J V(l - x) 2l> dx. 
r 2 , i 

x 2 sin — dx. 

J i x 

18. (" sin i/x — 1 dx. 

19. Jx sin x 2 cos x 2 dx. 

20. | vM + 3 cos 2 x sin 2x dx. 
21. Show that the value of the integral $\int_0^2 375x^5(x^2 + 1)^{-4}\,dx$ is $2^n$ for some integer $n$.

22. Determine a pair of numbers a and b for which JJ (ax + b)(x 2 + 3x + 2) 2 dx = 3/2. 

23. Let $I_n = \int_0^1 (1 - x^2)^n\,dx$. Show that $(2n + 1)I_n = 2n\,I_{n-1}$, then use this relation to compute
$I_2$, $I_3$, $I_4$, and $I_5$.

24. Let $F(m, n) = \int_0^x t^m(1 + t)^n\,dt$, $m > 0$, $n > 0$. Show that

$(m + 1)F(m, n) + nF(m + 1, n - 1) = x^{m+1}(1 + x)^n$.

Use this to evaluate $F(10, 2)$.

25. Let $f(n) = \int_0^{\pi/4} \tan^n x\,dx$ where $n \geq 1$. Show that

(a) $f(n + 1) < f(n)$.

(b) $f(n) + f(n - 2) = \frac{1}{n - 1}$  if $n > 2$.

(c) $\frac{1}{n + 1} < 2f(n) < \frac{1}{n - 1}$  if $n > 2$.

26. Compute $f(0)$, given that $f(\pi) = 2$ and that $\int_0^{\pi} [f(x) + f''(x)] \sin x\,dx = 5$.

27. Let $A$ denote the value of the integral

$\int_0^{\pi} \frac{\cos x}{(x + 2)^2}\,dx$.

Compute the following integral in terms of $A$:

$\int_0^{\pi/2} \frac{\sin x \cos x}{x + 1}\,dx$.

The formulas in Exercises 28 through 33 appear in integral tables. Verify each of these formulas 
by any method. 


28 

29 . 

JO. 

31 


f V a + bx , / — f dx 

■ )~-^ dx=1 ^ Tx+a )^rm +c - 


x n V ax + b dx = 


2 


a(2n + 3 ) 
2 


x n (ax + bf>' 2 — nb 


(2m + 1)6 


Va + bx 

dx _ V ax + b (2 n - 3 )a 


x n ~W ax + b dxj + C (n jt -|). 

f r x m ~ 1 \ 

x m V a + bx — ma - — dx ) + C (mb —A). 


+ b - (n — l)bx n 1 (2 n —2)bJ x n ~ 1 Vax~Fb + 


C in b 1 ). 


32 


33 


im — n) sin” x x m 


— If cos™ -2 x 

— nj sin" x 


dx + C (m b «), 


J? 

fCOS" X 

J sin”x 

f cos m x cos m+1 x m — n + 2 C cos™ x 

J sin” x X Qt — 1) sin” -1 x n — I J sin” -2 x dx + C ( n ^ 4 

34. (a) Find a polynomial $P(x)$ such that $P'(x) - 3P(x) = 4 - 5x + 3x^2$. Prove that there is
only one solution.
(b) If $Q(x)$ is a given polynomial, prove that there is one and only one polynomial $P(x)$ such
that $P'(x) - 3P(x) = Q(x)$.

35. A sequence of polynomials (called the Bernoulli polynomials) is defined inductively as follows:
$P_0(x) = 1$;  $P_n'(x) = nP_{n-1}(x)$ and $\int_0^1 P_n(x)\,dx = 0$ if $n \geq 1$.

(a) Determine explicit formulas for $P_1(x), P_2(x), \ldots, P_5(x)$.

(b) Prove, by induction, that $P_n(x)$ is a polynomial in $x$ of degree $n$, the term of highest degree
being $x^n$.

(c) Prove that $P_n(0) = P_n(1)$ if $n \geq 2$.

(d) Prove that $P_n(x + 1) - P_n(x) = nx^{n-1}$ if $n \geq 1$.

(e) Prove that for $n \geq 2$ we have

$\sum_{r=1}^{k-1} r^n = \int_0^k P_n(x)\,dx = \frac{P_{n+1}(k) - P_{n+1}(0)}{n + 1}$.

(f) Prove that $P_n(1 - x) = (-1)^n P_n(x)$ if $n \geq 1$.

(g) Prove that $P_{2n+1}(0) = 0$ and $P_{2n-1}(\tfrac{1}{2}) = 0$ if $n \geq 1$.

36. Assume that $|f''(x)| \leq m$ for each $x$ in the interval $[0, a]$, and assume that $f$ takes on its largest
value at an interior point of this interval. Show that $|f'(0)| + |f'(a)| \leq am$. You may assume
that $f''$ is continuous in $[0, a]$.
THE LOGARITHM, THE EXPONENTIAL,

AND THE INVERSE TRIGONOMETRIC FUNCTIONS 


6.1 Introduction 

Whenever man focuses his attention on quantitative relationships, he is either studying 
the properties of a known function or trying to discover the properties of an unknown 
function. The function concept is so broad and so general that it is not surprising to find 
an endless variety of functions occurring in nature. What is surprising is that a few rather 
special functions govern so many totally different kinds of natural phenomena. We shall
study some of these functions in this chapter: first of all, the logarithm and its inverse
(the exponential function) and secondly, the inverses of the trigonometric functions. Anyone
who studies mathematics, either as an abstract discipline or as a tool for some other
scientific field, will find that a good working knowledge of these functions and their
properties is indispensable.

The reader probably has had occasion to work with logarithms to the base 10 in an
elementary algebra or trigonometry course. The definition usually given in elementary
algebra is this: If $x > 0$, the logarithm of $x$ to the base 10, denoted by $\log_{10} x$, is that
real number $u$ such that $10^u = x$. If $x = 10^u$ and $y = 10^v$, the law of exponents yields
$xy = 10^{u+v}$. In terms of logarithms, this becomes

(6.1)    $\log_{10}(xy) = \log_{10} x + \log_{10} y$.

It is this fundamental property that makes logarithms particularly adaptable to computations
involving multiplication. The number 10 is useful as a base because real numbers
are commonly written in the decimal system, and certain important numbers like 0.01,
0.1, 1, 10, 100, 1000, . . . have for their logarithms the integers -2, -1, 0, 1, 2, 3, . . . ,
respectively.

It is not necessary to restrict ourselves to base 10. Any other positive base $b \neq 1$ would
serve equally well. Thus

(6.2)    $u = \log_b x$  means  $x = b^u$,

and the fundamental property in (6.1) becomes

(6.3)    $\log_b(xy) = \log_b x + \log_b y$.





If we examine the definition in (6.2) from a critical point of view, we find that it suffers
from several logical gaps. First of all, to understand (6.2) we must know what is meant
by $b^u$. This is easy to define when $u$ is an integer or a rational number (the quotient of two
integers), but it is not a trivial matter to define $b^u$ when $u$ is irrational. For example, how
should we define $10^{\sqrt{2}}$? Even if we manage to obtain a satisfactory definition for $b^u$,
there are further difficulties to overcome before we can use (6.2) as a good definition of
logarithms. It must be shown that for every $x > 0$, there actually exists a number $u$ such
that $x = b^u$. Also, the law of exponents, $b^u b^v = b^{u+v}$, must be established for all real
exponents $u$ and $v$ in order to derive (6.3) from (6.2).

It is possible to overcome these difficulties and arrive at a satisfactory definition of
logarithms by this method, but the process is long and tedious. Fortunately, however,
the study of logarithms can proceed in an entirely different way which is much simpler
and which illustrates the power and elegance of the methods of calculus. The idea is to
introduce logarithms first, and then use logarithms to define $b^u$.


6.2 Motivation for the definition of the natural logarithm as an integral 

The logarithm is an example of a mathematical concept that can be defined in many 
different ways. When a mathematician tries to formulate a definition of a concept, such 
as the logarithm, he usually has in mind a number of properties he wants this concept 
to have. By examining these properties, he is often led to a simple formula or process 
that might serve as a definition from which all the desired properties spring forth as logical 
deductions. We shall illustrate how this procedure may be used to arrive at the definition 
of the logarithm which is given in the next section. 

One of the properties we want logarithms to have is that the logarithm of a product
should be the sum of the logarithms of the individual factors. Let us consider this property
by itself and see where it leads us. If we think of the logarithm as a function $f$, then we
want this function to have the property expressed by the formula

(6.4)    $f(xy) = f(x) + f(y)$

whenever $x$, $y$, and $xy$ are in the domain of $f$.

An equation like (6.4), which expresses a relationship between the values of a function 
at two or more points, is called a functional equation. Many mathematical problems can 
be reduced to solving a functional equation, a solution being any function which satisfies 
the equation. Ordinarily an equation of this sort has many different solutions, and it is 
usually very difficult to find them all. It is easier to seek only those solutions which have 
some additional property such as continuity or differentiability. For the most part, these 
are the only solutions we are interested in anyway. We shall adopt this point of view and 
determine all differentiable solutions of (6.4). But first let us try to deduce what information 
we can from (6.4) alone, without any further restrictions on f . 

One solution of (6.4) is the function that is zero everywhere on the real axis. In fact,
this is the only solution of (6.4) that is defined for all real numbers. To prove this, let $f$
be any function that satisfies (6.4). If 0 is in the domain of $f$, then we may put $y = 0$ in
(6.4) to obtain $f(0) = f(x) + f(0)$, and this implies that $f(x) = 0$ for every $x$ in the domain
of $f$. In other words, if 0 is in the domain of $f$, then $f$ must be identically zero. Therefore,
a solution of (6.4) that is not identically zero cannot be defined at 0.

If $f$ is a solution of (6.4) and if the domain of $f$ includes 1, we may put $x = y = 1$ in
(6.4) to obtain $f(1) = 2f(1)$, and this implies

$f(1) = 0$.

If both 1 and $-1$ are in the domain of $f$, we may take $x = -1$ and $y = -1$ to deduce
that $f(1) = 2f(-1)$; hence $f(-1) = 0$. If now $x$, $-x$, 1, and $-1$ are in the domain of $f$,
we may put $y = -1$ in (6.4) to deduce $f(-x) = f(-1) + f(x)$ and, since $f(-1) = 0$,
we find

$f(-x) = f(x)$.

In other words, any solution of (6.4) is necessarily an even function.

Suppose, now, we assume that $f$ has a derivative $f'(x)$ at each $x \neq 0$. If we hold $y$ fixed
in (6.4) and differentiate with respect to $x$ (using the chain rule on the left), we find

$y f'(xy) = f'(x)$.

When $x = 1$, this gives us $y f'(y) = f'(1)$, and hence we have

$f'(y) = \frac{f'(1)}{y}$  for each $y \neq 0$.

From this equation we see that the derivative $f'$ is monotonic and hence integrable on
every closed interval not containing the origin. Also, $f'$ is continuous on every such interval,
and we may apply the second fundamental theorem of calculus to write

$f(x) - f(c) = \int_c^x f'(t)\,dt = f'(1)\int_c^x \frac{1}{t}\,dt$.

If $x > 0$, this equation holds for any positive $c$, and if $x < 0$, it holds for any negative $c$.
Since $f(1) = 0$, the choice $c = 1$ gives us

$f(x) = f'(1)\int_1^x \frac{1}{t}\,dt$  if $x > 0$.

If $x$ is negative then $-x$ is positive and, since $f(x) = f(-x)$, we find

$f(x) = f'(1)\int_1^{-x} \frac{1}{t}\,dt$  if $x < 0$.

These two formulas for $f(x)$ may be combined into one formula that is valid for both
positive and negative $x$, namely,

(6.5)    $f(x) = f'(1)\int_1^{|x|} \frac{1}{t}\,dt$  if $x \neq 0$.

Therefore we have shown that if there is a solution of (6.4) which has a derivative at each
point $x \neq 0$, then this solution must necessarily be given by the integral formula in (6.5).
If $f'(1) = 0$, then (6.5) implies that $f(x) = 0$ for all $x \neq 0$, and this solution agrees with
the solution that is identically zero. Therefore, if $f$ is not identically zero, we must have
$f'(1) \neq 0$, in which case we can divide both sides of (6.5) by $f'(1)$ to obtain

(6.6)    $g(x) = \int_1^{|x|} \frac{1}{t}\,dt$  if $x \neq 0$,

where $g(x) = f(x)/f'(1)$. The function $g$ is also a solution of (6.4), since $cf$ is a solution
whenever $f$ is. This proves that if (6.4) has a solution that is not identically zero and if
this solution has a derivative everywhere except at the origin, then the function $g$ given
by (6.6) is also a solution, and all solutions may be obtained from this one by multiplying
$g$ by a suitable constant.

It should be emphasized that this argument does not prove that the function g in (6.6) 
actually is a solution, because we derived (6.6) on the assumption that there is at least one 
solution that is not identically zero. Formula (6.6) suggests a way to construct such a 
solution. We simply operate in reverse. That is, we use the integral in (6.6) to define a 
function g, and then we verify directly that this function actually satisfies (6.4). This 
suggests that we should define the logarithm to be the function $g$ given by (6.6). If we
did so, this function would have the property that $g(-x) = g(x)$ or, in other words,
distinct numbers would have the same logarithm. For some of the things we want to do
later, it is preferable to define the logarithm in such a way that no two distinct numbers 
have the same logarithm. This latter property may be achieved by defining the logarithm 
only for positive numbers. Therefore we use the following definition. 


6.3 The definition of the logarithm. Basic properties 

DEFINITION. If $x$ is a positive real number, we define the natural logarithm of $x$, denoted
temporarily by $L(x)$, to be the integral

(6.7)    $L(x) = \int_1^x \frac{1}{t}\,dt$.

When $x > 1$, $L(x)$ may be interpreted geometrically as the area of the shaded region
shown in Figure 6.1.

THEOREM 6.1. The logarithm function has the following properties:

(a) $L(1) = 0$.

(b) $L'(x) = \frac{1}{x}$ for every $x > 0$.

(c) $L(ab) = L(a) + L(b)$ for every $a > 0$, $b > 0$.

Proof. Part (a) follows at once from the definition. To prove (b), we simply note that
$L$ is an indefinite integral of a continuous function and apply the first fundamental theorem
of calculus. Property (c) follows from the additive property of the integral. We write

$L(ab) = \int_1^{ab} \frac{dt}{t} = \int_1^a \frac{dt}{t} + \int_a^{ab} \frac{dt}{t}$.

In the last integral we make the substitution $u = t/a$, $du = dt/a$, and we find that the
integral reduces to $L(b)$, thus proving (c).
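Since $L$ is defined as an integral, its properties can be explored numerically before any closed form is available. The sketch below approximates $L(x)$ by Simpson's rule and checks properties (a) and (c) of Theorem 6.1 for sample values. The `simpson` helper, the step count, and the sample values $a = 2$, $b = 3$ are assumptions; the comparison with Python's `math.log` presumes that library function computes the natural logarithm.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def L(x):
    """L(x) = integral of 1/t from 1 to x, for x > 0 (negative when 0 < x < 1)."""
    return simpson(lambda t: 1.0 / t, 1.0, x)

a, b = 2.0, 3.0
print(L(1.0))                    # property (a)
print(L(a * b), L(a) + L(b))     # property (c)
print(L(6.0), math.log(6.0))     # comparison with the library logarithm
```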



Figure 6.1 Interpretation of the logarithm as an area.

Figure 6.2 The graph of the natural logarithm.


6.4 The graph of the natural logarithm 

The graph of the logarithm function has the general shape shown in Figure 6.2. Many
properties of this curve can be discovered without undue calculation simply by referring
to the properties in Theorem 6.1. For example, from (b) we see that $L$ has a positive
derivative everywhere so it is strictly increasing on every interval. Since $L(1) = 0$, the
graph lies above the $x$-axis if $x > 1$ and below the axis if $0 < x < 1$. The curve has slope
1 when $x = 1$. For $x > 1$, the slope gradually decreases toward zero as $x$ increases
indefinitely. For small values of $x$, the slope is large and, moreover, it increases without
bound as $x$ decreases toward zero. The second derivative is $L''(x) = -1/x^2$ which is
negative for all $x$, so $L$ is a concave function.

6.5 Consequences of the functional equation $L(ab) = L(a) + L(b)$

Since the graph of the logarithm tends to level off as $x$ increases indefinitely, it might
be suspected that the values of $L$ have an upper bound. Actually, the function is unbounded
above; that is, for every positive number $M$ (no matter how large) there exist values of $x$
such that

(6.8)    $L(x) > M$.

We can deduce this from the functional equation. When $a = b$, we get $L(a^2) = 2L(a)$.
Using the functional equation once more with $b = a^2$, we obtain $L(a^3) = 3L(a)$. By
induction we find the general formula

$L(a^n) = nL(a)$

for every integer $n \geq 1$. When $a = 2$, this becomes $L(2^n) = nL(2)$, and hence we have

(6.9)    $L(2^n) > M$  when  $n > \frac{M}{L(2)}$.

This proves the assertion in (6.8). Taking $b = 1/a$ in the functional equation, we find
$L(1/a) = -L(a)$. In particular, when $a = 2^n$, where $n$ is chosen as in (6.9), we have

$L\left(\frac{1}{2^n}\right) = -L(2^n) < -M$,

which shows that there is also no lower bound to the function values.
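The argument behind (6.9) is constructive: given $M$, it names an explicit exponent $n$. A minimal sketch follows, assuming Python's `math.log` plays the role of $L$; the function name `exponent_beating` is hypothetical.

```python
import math

def exponent_beating(M):
    """Smallest integer n >= 1 with n * L(2) > M, where L is the natural logarithm."""
    return max(1, math.floor(M / math.log(2)) + 1)

for M in (1, 10, 100):
    n = exponent_beating(M)
    # L(2^n) exceeds M, and L(1/2^n) = -L(2^n) falls below -M.
    print(M, n, math.log(2 ** n), math.log(2.0 ** -n))
```

The exponent grows only linearly in $M$, while the argument $2^n$ grows exponentially, which is consistent with how slowly the graph rises.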

Finally we observe that the graph crosses every horizontal line exactly once. That is,
given an arbitrary real number $b$ (positive, negative, or zero), there is one and only one
$a > 0$ such that

(6.10)    $L(a) = b$.

To prove this we can argue as follows: If $b > 0$, choose any integer $n > b/L(2)$. Then
$L(2^n) > b$ because of (6.9). Now examine the function $L$ on the closed interval $[1, 2^n]$.
Its value at the left endpoint is $L(1) = 0$, and its value at the right endpoint is $L(2^n)$.
Since $0 < b < L(2^n)$, the intermediate-value theorem for continuous functions (Theorem
3.8 in Section 3.10) guarantees the existence of at least one $a$ such that $L(a) = b$. There
cannot be another value $a'$ such that $L(a') = b$ because this would mean $L(a) = L(a')$
for $a \neq a'$, thus contradicting the increasing property of the logarithm. Therefore the
assertion in (6.10) has been proved for $b > 0$. The proof for negative $b$ follows from this
if we use the equation $L(1/a) = -L(a)$. In other words, we have proved the following.

THEOREM 6.2. For every real number $b$ there is exactly one positive real number $a$ whose
logarithm, $L(a)$, is equal to $b$.


In particular, there is exactly one number whose natural logarithm is equal to 1. This
number, like $\pi$, occurs repeatedly in so many mathematical formulas that it was inevitable
that a special symbol would be adopted for it. Leonhard Euler (1707-1783) seems to have
been the first to recognize the importance of this number, and he modestly denoted it
by $e$, a notation which soon became standard.

DEFINITION. We denote by $e$ that number for which

(6.11)    $L(e) = 1$.

In Chapter 7 we shall obtain explicit formulas that enable us to calculate the decimal
expansion of $e$ to any desired degree of accuracy. Its value, correct to ten decimal places,
is 2.7182818285. In Chapter 7 we also prove that $e$ is irrational.

Natural logarithms are also called Napierian logarithms, in honor of their inventor,
John Napier (1550-1617). It is common practice to use the symbols $\ln x$ or $\log x$ instead
of $L(x)$ to denote the logarithm of $x$.
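Theorem 6.2 together with the increasing property of $L$ suggests a simple way to approximate $e$: bisect on the equation $L(a) = 1$. The sketch below does this with a Simpson's-rule approximation of $L$. The bracket $[1, 4]$ (valid because $L(1) = 0$ and $L(4) = 2L(2) > 1$), the helper names, and the iteration counts are assumptions; this is not the method of Chapter 7.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def L(x):
    """Numerical approximation of the integral of 1/t from 1 to x."""
    return simpson(lambda t: 1.0 / t, 1.0, x)

# L is strictly increasing, so the solution of L(a) = 1 is unique in the bracket.
lo, hi = 1.0, 4.0
for _ in range(50):
    mid = (lo + hi) / 2
    if L(mid) < 1.0:
        lo = mid
    else:
        hi = mid
e_approx = (lo + hi) / 2
print(e_approx)
```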


6.6 Logarithms referred to any positive base $b \neq 1$

The work of Section 6.2 tells us that the most general f which is differentiable on the 
positive real axis and which satisfies the functional equation f(xy ) = f(x) + f(y) is given 
by the formula 

(6.12) f(x) = c log x , 


where c is a constant. For each c, we could call this f(x) the logarithm of x associated with
c although, of course, its value would not necessarily be the same as the natural logarithm
of x. When c = 0, f is identically zero, so this case is uninteresting. If $c \ne 0$, we may
indicate in another way the dependence of f on c by introducing the concept of a base
for logarithms.

From (6.12) we see that when $c \ne 0$, there exists a unique real number b > 0 such that
f(b) = 1. This b is related to c by the equation c log b = 1; hence $b \ne 1$, c = 1/log b,
and (6.12) becomes

$f(x) = \frac{\log x}{\log b} .$

For this choice of c we say that f(x) is the logarithm of x to the base b and we write $\log_b x$
for f(x).


DEFINITION. If b > 0, $b \ne 1$, and if x > 0, the logarithm of x to the base b is the number

$\log_b x = \frac{\log x}{\log b} ,$

where the logarithms on the right are natural logarithms.


Note that $\log_b b = 1$. Also, when b = e, we have $\log_e x = \log x$, so natural logarithms
are those with base e. Since logarithms to base e are used so frequently in mathematics,
the word logarithm almost invariably means natural logarithm. Later, in Section 6.15,
we shall define $b^u$ in such a way that the equation $b^u = x$ will mean exactly the same as the
equation $u = \log_b x$.

Since logarithms to the base b are obtained from natural logarithms by multiplying by
the constant 1/log b, the graph of the equation $y = \log_b x$ may be obtained from that of
the equation y = log x by simply multiplying all ordinates by the same factor. When
b > 1, this factor is positive, and, when b < 1, it is negative. Examples with b > 1 are
shown in Figure 6.3(a). When b < 1, we note that 1/b > 1 and log b = -log (1/b), so
the graph of $y = \log_b x$ may be obtained from that of $y = \log_{1/b} x$ by reflection through
the x-axis. Examples are shown in Figure 6.3(b).

Figure 6.3 The graph of $y = \log_b x$ for various values of b.
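
The definition translates directly into a computation. The short Python sketch below is illustrative only; the function name log_base is an assumption, and math.log supplies the natural logarithm. It evaluates $\log_b x$ as a quotient of natural logarithms and lets us observe the sign change that occurs when b is replaced by 1/b.

```python
import math

def log_base(b, x):
    """Logarithm of x to the base b, defined as log x / log b."""
    if b <= 0 or b == 1 or x <= 0:
        raise ValueError("need b > 0, b != 1, and x > 0")
    return math.log(x) / math.log(b)

# Replacing b by 1/b changes only the sign, which is the reflection
# through the x-axis described in the text.
print(log_base(2, 8), log_base(0.5, 8))
```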


6.7 Differentiation and integration formulas involving logarithms 

Since the derivative of the logarithm is given by the formula D log x = 1/x for x > 0,
we have the integration formula

$\int \frac{1}{x}\,dx = \log x + C .$

More generally, if u = f(x), where f has a continuous derivative, we have

(6.13)    $\int \frac{du}{u} = \log u + C$    or    $\int \frac{f'(x)}{f(x)}\,dx = \log f(x) + C .$

Some care must be exercised when using (6.13) because the logarithm is not defined for
negative numbers. Therefore, the integration formulas in (6.13) are valid only if u, or
f(x), is positive.

Fortunately it is easy to extend the range of validity of these formulas to accommodate
functions that are negative or positive (but nonzero). We simply introduce a new function
$L_0$ defined for all real $x \ne 0$ by the equation

(6.14)    $L_0(x) = \log |x| = \int_1^{|x|} \frac{1}{t}\,dt ,$

a definition suggested by Equation (6.6) of Section 6.2. The graph of $L_0$ is symmetric
about the y-axis, as shown in Figure 6.4. The portion to the right of the y-axis is exactly
the same as the logarithmic curve of Figure 6.2.

Since log |xy| = log (|x| |y|) = log |x| + log |y|, the function $L_0$ also satisfies the basic
functional equation in (6.4). That is, we have

$L_0(xy) = L_0(x) + L_0(y)$

for all real x and y except 0. For x > 0, we have $L_0'(x) = 1/x$ since $L_0(x)$ is the same as
log x for positive x. This derivative formula also holds for x < 0 because, in this case,
$L_0(x) = L(-x)$, and hence $L_0'(x) = -L'(-x) = -1/(-x) = 1/x$. Therefore we have

(6.15)    $L_0'(x) = \frac{1}{x}$    for all real $x \ne 0$ .



Hence, if we use $L_0$ instead of L in the foregoing integration formulas, we can extend
their scope to include functions which assume negative values as well as positive values.
For example, (6.13) can be generalized as follows:

(6.16)    $\int \frac{du}{u} = \log |u| + C ,$    $\int \frac{f'(x)}{f(x)}\,dx = \log |f(x)| + C .$

Of course, when we use (6.16) along with the second fundamental theorem of calculus to 
evaluate a definite integral, we must avoid intervals that include points where u or 
f(x) might be zero. 

EXAMPLE 1. Integrate $\int \tan x\,dx$.

Solution. The integral has the form $-\int du/u$, where u = cos x, du = -sin x dx. Therefore
we have

$\int \tan x\,dx = -\int \frac{du}{u} = -\log |u| + C = -\log |\cos x| + C ,$

a formula which is valid on any interval in which $\cos x \ne 0$.

The next two examples illustrate the use of integration by parts.

EXAMPLE 2. Integrate $\int \log x\,dx$.

Solution. Let u = log x, dv = dx. Then du = dx/x, v = x, and we obtain

$\int \log x\,dx = \int u\,dv = uv - \int v\,du = x \log x - \int x \cdot \frac{1}{x}\,dx = x \log x - x + C .$


EXAMPLE 3. Integrate $\int \sin (\log x)\,dx$.

Solution. Let u = sin (log x), v = x. Then du = cos (log x)(1/x) dx, and we find

$\int \sin (\log x)\,dx = \int u\,dv = uv - \int v\,du = x \sin (\log x) - \int \cos (\log x)\,dx .$

In the last integral we use integration by parts once more to get

$\int \cos (\log x)\,dx = x \cos (\log x) + \int \sin (\log x)\,dx .$

Combining this with the foregoing equation, we find that

$\int \sin (\log x)\,dx = \frac{1}{2} x \sin (\log x) - \frac{1}{2} x \cos (\log x) + C ,$

and

$\int \cos (\log x)\,dx = \frac{1}{2} x \sin (\log x) + \frac{1}{2} x \cos (\log x) + C .$
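
A quick way to gain confidence in formulas such as these is to differentiate the answer numerically and compare with the integrand. The Python sketch below is an informal check, not a proof; the helper names derivative and max_error and the sample points are assumptions made for the illustration.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Each pair is (antiderivative from the examples, integrand it should reproduce).
checks = {
    "tan x": (lambda x: -math.log(abs(math.cos(x))), math.tan),
    "log x": (lambda x: x * math.log(x) - x, math.log),
    "sin(log x)": (
        lambda x: 0.5 * x * (math.sin(math.log(x)) - math.cos(math.log(x))),
        lambda x: math.sin(math.log(x)),
    ),
    "cos(log x)": (
        lambda x: 0.5 * x * (math.sin(math.log(x)) + math.cos(math.log(x))),
        lambda x: math.cos(math.log(x)),
    ),
}

def max_error(F, f, points=(0.5, 1.0, 1.3, 2.0)):
    """Largest gap between the numerical F' and the integrand f at the sample points."""
    return max(abs(derivative(F, x) - f(x)) for x in points)

for name, (F, f) in checks.items():
    print(name, max_error(F, f))
```

The sample points are positive and avoid the zeros of cos x, so every formula is used only where the text says it is valid.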


6.8 Logarithmic differentiation 

We shall describe now a technique known as logarithmic differentiation which is often 
a great help in computing derivatives. The method was developed in 1697 by Johann 
Bernoulli (1667-1748), and all it amounts to is a simple application of the chain rule. 
Suppose we form the composition of $L_0$ with any differentiable function f; say we let

$g(x) = L_0[f(x)] = \log |f(x)|$

for those x such that $f(x) \ne 0$. The chain rule, used in conjunction with (6.15), yields the
formula

(6.17)    $g'(x) = L_0'[f(x)] \cdot f'(x) = \frac{f'(x)}{f(x)} .$

If the derivative g'(x) can be found in some other way, then we may use (6.17) to obtain
f'(x) by simply multiplying g'(x) by f(x). The process is useful in practice because in
many cases g'(x) is easier to compute than f'(x) itself. In particular, this is true when f is
a product or quotient of several simpler functions. The following example is typical.

EXAMPLE. Compute f'(x) if $f(x) = x^2 \cos x\,(1 + x^4)^{-7}$.

Solution. We take the logarithm of the absolute value of f(x) and then we differentiate.
Let

$g(x) = \log |f(x)| = \log x^2 + \log |\cos x| + \log (1 + x^4)^{-7} = 2 \log |x| + \log |\cos x| - 7 \log (1 + x^4) .$

Differentiation yields

$g'(x) = \frac{f'(x)}{f(x)} = \frac{2}{x} - \frac{\sin x}{\cos x} - \frac{28x^3}{1 + x^4} .$

Multiplying by f(x), we obtain

$f'(x) = \frac{2x \cos x}{(1 + x^4)^7} - \frac{x^2 \sin x}{(1 + x^4)^7} - \frac{28x^5 \cos x}{(1 + x^4)^8} .$
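
The same steps can be followed in a few lines of Python. This is only a sketch under stated assumptions: the names g_prime, f_prime, and f_prime_numeric are hypothetical, and the finite-difference estimate serves merely as an independent check on the result of logarithmic differentiation.

```python
import math

def f(x):
    return x**2 * math.cos(x) * (1 + x**4) ** -7

def g_prime(x):
    """Derivative of g(x) = log|f(x)|, computed term by term."""
    return 2 / x - math.sin(x) / math.cos(x) - 28 * x**3 / (1 + x**4)

def f_prime(x):
    """f'(x) recovered as g'(x) * f(x), following (6.17)."""
    return g_prime(x) * f(x)

def f_prime_numeric(x, h=1e-6):
    """Central-difference estimate used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.3, 0.8, 1.2, -0.7):
    print(x, f_prime(x), f_prime_numeric(x))
```

Note that g_prime requires $x \ne 0$ and $\cos x \ne 0$, the same restriction $f(x) \ne 0$ that the method itself imposes.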


6.9 Exercises 


1. (a) Find all c such that $\log x = c + \int_e^x t^{-1}\,dt$ for all x > 0.

(b) Let f(x) = log [(1 + x)/(1 - x)] if x > 0. If a and b are given numbers, with $ab \ne -1$,
find all x such that f(x) = f(a) + f(b).

2. In each case, find a real x satisfying the given equation.

(a) log (1 + x) = log (1 - x).

(b) log (1 + x) = 1 + log (1 - x).

(c) 2 log x = x log 2, $x \ne 2$.

(d) $\log (\sqrt{x} + \sqrt{x + 1}) = 1$.

3. Let f(x) = (log x)/x if x > 0. Describe the intervals in which f is increasing, decreasing,
convex, and concave. Sketch the graph of f.


In Exercises 4 through 15, find the derivative f'(x). In each case, the function f is assumed to be
defined for all real x for which the given formula for f(x) is meaningful.

4. $f(x) = \log (1 + x^2)$.

5. $f(x) = \log \sqrt{1 + x^2}$.

6. $f(x) = \log \sqrt{4 - x^2}$.

7. $f(x) = \log (\log x)$.

8. $f(x) = \log (x^2 \log x)$.

9. $f(x) = \frac{1}{4} \log \frac{x^2 - 1}{x^2 + 1}$.

10. $f(x) = (x + \sqrt{1 + x^2})^n$.

11. $f(x) = \sqrt{x + 1} - \log (1 + \sqrt{x + 1})$.

12. $f(x) = x \log (x + \sqrt{1 + x^2}) - \sqrt{1 + x^2}$.

13. $f(x) = \frac{1}{2\sqrt{ab}} \log \frac{\sqrt{a} + x\sqrt{b}}{\sqrt{a} - x\sqrt{b}}$.

14. $f(x) = x[\sin (\log x) - \cos (\log x)]$.

15. $f(x) = \log_x e$.


In Exercises 16 through 26, evaluate the integrals.

16. $\int \frac{dx}{2 + 3x}$.

17. $\int \log^2 x\,dx$.

18. $\int x \log x\,dx$.

19. $\int x \log^2 x\,dx$.


ft- 1 dt 

20 . . 

Jo 1 + 1 

21. $\int \cot x\,dx$.

22. $\int x^n \log (ax)\,dx$.

23. $\int x^2 \log^2 x\,dx$.

24. $\int \frac{dx}{x \log x}$.

25. jr io fi7^ «. 

26. $\int \frac{\log |x|}{x \sqrt{1 + \log |x|}}\,dx$.

27. Derive the recursion formula

$\int x^m \log^n x\,dx = \frac{x^{m+1} \log^n x}{m + 1} - \frac{n}{m + 1} \int x^m \log^{n-1} x\,dx$

and use it to integrate $\int x^3 \log^3 x\,dx$.

28. (a) If x > 0, let f(x) = x - 1 - log x, g(x) = log x - 1 + 1/x. Examine the signs of f'
and g' to prove that the inequalities

$1 - \frac{1}{x} < \log x < x - 1$

are valid for x > 0, $x \ne 1$. When x = 1, they become equalities.

(b) Sketch graphs of the functions A and B defined by the equations A(x) = x - 1 and
B(x) = 1 - 1/x for x > 0, and interpret geometrically the inequalities in part (a).

29. Prove the limit relation

$\lim_{x \to 0} \frac{\log (1 + x)}{x} = 1$

by the following two methods: (a) using the definition of the derivative L'(1); (b) using the
result of Exercise 28.

30. If a > 0, use the functional equation for the logarithm to prove that $\log (a^r) = r \log a$ for
every rational number r.

31. Let $P = \{a_0, a_1, a_2, \ldots, a_n\}$ be any partition of the interval [1, x], where x > 1.

(a) Integrate suitable step functions that are constant on the open subintervals of P to derive
the following inequalities:

$\sum_{k=1}^{n} \frac{a_k - a_{k-1}}{a_k} < \log x < \sum_{k=1}^{n} \frac{a_k - a_{k-1}}{a_{k-1}} .$

(b) Interpret the inequalities of part (a) geometrically in terms of areas.

(c) Specialize the partition to show that for every integer n > 1,

$\sum_{k=2}^{n} \frac{1}{k} < \log n < \sum_{k=1}^{n-1} \frac{1}{k} .$

32. Prove the following formulas for changing from one logarithmic base to another:

(a) $\log_b x = \log_b a \, \log_a x$;    (b) $\log_b x = \frac{\log_a x}{\log_a b}$.

33. Given that $\log_e 10 = 2.302585$, correct to six decimal places, compute $\log_{10} e$ using one of the
formulas in Exercise 32. How many correct decimal places can you be certain of in the result
of your calculation? Note: A table, correct to six decimal places, gives $\log_{10} e = 0.434294$.
34. A function f, continuous on the positive real axis, has the property that for all choices of
x > 0 and y > 0, the integral

$\int_x^{xy} f(t)\,dt$

is independent of x (and therefore depends only on y). If f(2) = 2, compute the value of the
integral $A(x) = \int_1^x f(t)\,dt$ for all x > 0.

35. A function f, continuous on the positive real axis, has the property that

$\int_1^{xy} f(t)\,dt = y \int_1^x f(t)\,dt + x \int_1^y f(t)\,dt$

for all x > 0 and all y > 0. If f(1) = 3, compute f(x) for each x > 0.

36. The base of a solid is the ordinate set of a function f which is continuous over the interval
[1, a]. All cross sections perpendicular to the interval [1, a] are squares. The volume of the
solid is $\frac{1}{3} a^3 \log^2 a - \frac{2}{9} a^3 \log a + \frac{2}{27} a^3 - \frac{2}{27}$ for every a > 1. Compute f(a).


6.10 Polynomial approximations to the logarithm 


In this section we will show that the logarithm function can be approximated by certain 
polynomials which can be used to compute logarithms to any desired degree of accuracy. 

To simplify the resulting formulas, we first replace x by 1 - x in the integral defining
the logarithm to obtain

$\log (1 - x) = \int_1^{1-x} \frac{dt}{t} ,$

which is valid if x < 1. The change of variable t = 1 - u converts this to the form

$-\log (1 - x) = \int_0^x \frac{du}{1 - u} ,$    valid for x < 1.

Now we approximate the integrand 1/(1 - u) by polynomials which we then integrate to
obtain corresponding approximations for the logarithm. To illustrate the method, we
begin with a simple linear approximation to the integrand.

From the algebraic identity $1 - u^2 = (1 - u)(1 + u)$, we obtain the formula

(6.18)    $\frac{1}{1 - u} = 1 + u + \frac{u^2}{1 - u} ,$

valid for any real $u \ne 1$. Integrating this from 0 to x, where x < 1, we have

(6.19)    $-\log (1 - x) = x + \frac{x^2}{2} + \int_0^x \frac{u^2}{1 - u}\,du .$

The graph of the quadratic polynomial $P(x) = x + \frac{1}{2} x^2$ which appears on the right of
(6.19) is shown in Figure 6.5 along with the curve y = -log (1 - x). Note that for x near
zero the polynomial P(x) is a good approximation to -log (1 - x). In the next theorem
we use a polynomial of degree n - 1 to approximate 1/(1 - u), and thereby obtain a
polynomial of degree n which approximates -log (1 - x).

THEOREM 6.3. Let $P_n$ denote the polynomial of degree n given by

$P_n(x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^n}{n} = \sum_{k=1}^{n} \frac{x^k}{k} .$

Then, for every x < 1 and every $n \ge 1$, we have

(6.20)    $-\log (1 - x) = P_n(x) + \int_0^x \frac{u^n}{1 - u}\,du .$

Proof. From the algebraic identity

$1 - u^n = (1 - u)(1 + u + u^2 + \cdots + u^{n-1}) ,$

we obtain the formula

$\frac{1}{1 - u} = 1 + u + u^2 + \cdots + u^{n-1} + \frac{u^n}{1 - u} ,$

which is valid for $u \ne 1$. Integrating this from 0 to x, where x < 1, we obtain (6.20).
We can rewrite (6.20) in the form

(6.21)    $-\log (1 - x) = P_n(x) + E_n(x) ,$

where $E_n(x)$ is the integral,

$E_n(x) = \int_0^x \frac{u^n}{1 - u}\,du .$

The quantity $E_n(x)$ represents the error made when we approximate -log (1 - x) by the
polynomial $P_n(x)$. To use (6.21) in computations, we need to know whether the error is
positive or negative and how large it can be. The next theorem tells us that for small
positive x the error $E_n(x)$ is positive, but for negative x the error has the same sign as
$(-1)^{n+1}$, where n is the degree of the approximating polynomial. The theorem also gives
useful upper and lower bounds for the error.


THEOREM 6.4. If 0 < x < 1, we have the inequalities

(6.22)    $\frac{x^{n+1}}{n + 1} \le E_n(x) \le \frac{1}{1 - x} \frac{x^{n+1}}{n + 1} .$

If x < 0, the error $E_n(x)$ has the same sign as $(-1)^{n+1}$, and we have

(6.23)    $0 < (-1)^{n+1} E_n(x) \le \frac{|x|^{n+1}}{n + 1} .$

Proof. Assume that 0 < x < 1. In the integral defining $E_n(x)$ we have $0 \le u \le x$, so
$1 - x \le 1 - u \le 1$, and hence the integrand satisfies the inequalities

$u^n \le \frac{u^n}{1 - u} \le \frac{u^n}{1 - x} .$

Integrating these inequalities, we obtain (6.22).

To prove (6.23), assume x < 0 and let t = -x = |x|. Then t > 0 and we have

$E_n(x) = E_n(-t) = \int_0^{-t} \frac{u^n}{1 - u}\,du = -\int_0^t \frac{(-v)^n}{1 + v}\,dv = (-1)^{n+1} \int_0^t \frac{v^n}{1 + v}\,dv .$

This shows that $E_n(x)$ has the same sign as $(-1)^{n+1}$. Also, we have

$(-1)^{n+1} E_n(x) = \int_0^t \frac{v^n}{1 + v}\,dv \le \int_0^t v^n\,dv = \frac{t^{n+1}}{n + 1} = \frac{|x|^{n+1}}{n + 1} ,$

which completes the proof of (6.23).
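
The content of Theorems 6.3 and 6.4 can be explored numerically. In the following Python sketch, offered only as an illustration, the names P, E, and bounds are assumptions; E is computed as the difference $-\log (1 - x) - P_n(x)$ using the library logarithm rather than from the integral.

```python
import math

def P(n, x):
    """P_n(x) = x + x^2/2 + ... + x^n/n."""
    return sum(x**k / k for k in range(1, n + 1))

def E(n, x):
    """Error E_n(x) = -log(1 - x) - P_n(x), meaningful for x < 1."""
    return -math.log(1 - x) - P(n, x)

def bounds(n, x):
    """Lower and upper bounds for E_n(x) taken from (6.22), for 0 < x < 1."""
    base = x ** (n + 1) / (n + 1)
    return base, base / (1 - x)

for n in (1, 2, 5):
    low, high = bounds(n, 0.5)
    print(n, low, E(n, 0.5), high)
```

For each n the printed error should fall between the two bounds, and the bounds themselves shrink as n grows.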

The next theorem gives a formula which is admirably suited for computations of logarithms.

THEOREM 6.5. If 0 < x < 1 and if $m \ge 1$, we have

$\log \frac{1 + x}{1 - x} = 2 \left( x + \frac{x^3}{3} + \cdots + \frac{x^{2m-1}}{2m - 1} \right) + R_m(x) ,$

where the error term, $R_m(x)$, satisfies the inequalities

(6.24)    $\frac{x^{2m+1}}{2m + 1} < R_m(x) \le \frac{2 - x}{1 - x} \frac{x^{2m+1}}{2m + 1} .$
Proof. Equation (6.21) is valid for any real x < 1. If we replace x by -x in (6.21),
keeping x > -1, we obtain the formula

(6.25)    $-\log (1 + x) = P_n(-x) + E_n(-x) .$

If -1 < x < 1, both (6.21) and (6.25) are valid. Subtracting (6.21) from (6.25), we find

(6.26)    $\log \frac{1 + x}{1 - x} = P_n(x) - P_n(-x) + E_n(x) - E_n(-x) .$

In the difference $P_n(x) - P_n(-x)$, the even powers of x cancel and the odd powers double
up. Therefore, if n is even, say n = 2m, we have

$P_{2m}(x) - P_{2m}(-x) = 2 \left( x + \frac{x^3}{3} + \cdots + \frac{x^{2m-1}}{2m - 1} \right) ,$

and Equation (6.26) becomes

$\log \frac{1 + x}{1 - x} = 2 \left( x + \frac{x^3}{3} + \cdots + \frac{x^{2m-1}}{2m - 1} \right) + R_m(x) ,$

where $R_m(x) = E_{2m}(x) - E_{2m}(-x)$. This formula is valid if x lies in the open interval
-1 < x < 1. Now we restrict x to the interval 0 < x < 1. Then the estimates of Theorem
6.4 give us

$\frac{x^{2m+1}}{2m + 1} \le E_{2m}(x) \le \frac{1}{1 - x} \frac{x^{2m+1}}{2m + 1}$    and    $0 < -E_{2m}(-x) \le \frac{x^{2m+1}}{2m + 1} .$

Adding these, we obtain the inequalities in (6.24), since 1 + 1/(1 - x) = (2 - x)/(1 - x).

EXAMPLE. Taking m = 2 and $x = \frac{1}{3}$, we have (1 + x)/(1 - x) = 2, and we obtain the
formula

$\log 2 = 2 \left( \frac{1}{3} + \frac{1}{81} \right) + R_2(\tfrac{1}{3}) ,$    where    $\frac{1}{5} \left( \frac{1}{3} \right)^5 < R_2(\tfrac{1}{3}) \le \frac{1}{2} \left( \frac{1}{3} \right)^5 = \frac{1}{486} .$

This gives us the inequalities 0.6921 < log 2 < 0.6935 with very little calculation.
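
The computation in this example is easily mechanized. The Python sketch below is illustrative; the function name log_ratio_bounds is an assumption. It returns the lower and upper estimates for log [(1 + x)/(1 - x)] that follow from Theorem 6.5 and (6.24).

```python
import math

def log_ratio_bounds(x, m):
    """Bounds on log((1 + x)/(1 - x)) from Theorem 6.5, for 0 < x < 1 and m >= 1."""
    # Partial sum 2(x + x^3/3 + ... + x^(2m-1)/(2m-1)).
    partial = 2 * sum(x ** (2 * k - 1) / (2 * k - 1) for k in range(1, m + 1))
    # Common factor x^(2m+1)/(2m+1) appearing on both sides of (6.24).
    base = x ** (2 * m + 1) / (2 * m + 1)
    return partial + base, partial + (2 - x) / (1 - x) * base

low, high = log_ratio_bounds(1 / 3, 2)
print(low, math.log(2), high)
```

Increasing m tightens both estimates, which is the idea behind the exercises of the next section.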
6.11 Exercises

1. Use Theorem 6.5 with $x = \frac{1}{3}$ and m = 5 to calculate approximations to log 2. Retain nine
decimals in your calculations and obtain the inequalities 0.6931460 < log 2 < 0.6931476.

2. If $x = \frac{1}{5}$, then $(1 + x)/(1 - x) = \frac{3}{2}$. Thus, Theorem 6.5 enables us to compute log 3 in terms
of log 2. Take $x = \frac{1}{5}$ and m = 5 in Theorem 6.5 and use the results of Exercise 1 to obtain the
inequalities 1.098611 < log 3 < 1.098617.

Note: Since log 2 < log e < log 3, it follows that 2 < e < 3.

3. Use Theorem 6.5 with $x = \frac{1}{9}$ to calculate log 5 in terms of log 2. Choose the degree of the
approximating polynomial high enough to obtain the inequalities 1.609435 < log 5 < 1.609438.

4. Use Theorem 6.5 with $x = \frac{1}{6}$ to calculate log 7 in terms of log 5. Choose the degree of the
approximating polynomial high enough to obtain the inequalities 1.945907 < log 7 < 1.945911.

5. Use the results of Exercises 1 through 4 to calculate a short table listing log n for n = 2, 3, . . . ,
10. Tabulate each entry with as many correct decimal places as you can be certain of from the
inequalities in Exercises 1 through 4.


6.12 The exponential function 

Theorem 6.2 shows that for every real x there is one and only one y such that L(y) = x.
Therefore we can use the process of inversion to define y as a function of x. The resulting
inverse function is called the exponential function, or the antilogarithm, and is denoted by E.

DEFINITION. For any real x, we define E(x) to be that number y whose logarithm is x.
That is, y = E(x) means that L(y) = x.

The domain of E is the entire real axis; its range is the set of positive real numbers. The
graph of E, which is shown in Figure 6.6, is obtained from the graph of the logarithm by
reflection through the line y = x. Since L and E are inverses of each other, we have

L[E(x)] = x for all x    and    E[L(y)] = y for all y > 0.

Figure 6.6 The graph of the exponential function is obtained from that of the
logarithm by reflection through the line y = x.

Each property of the logarithm can be translated into a property of the exponential.
For example, since the logarithm is strictly increasing and continuous on the positive real
axis, it follows from Theorem 3.10 that the exponential is strictly increasing and continuous
on the entire real axis. The counterpart of Theorem 6.1 is given by the following theorem.


THEOREM 6.6. The exponential function has the following properties:

(a) E(0) = 1, E(1) = e.

(b) E'(x) = E(x) for every x.

(c) E(a + b) = E(a)E(b) for all a and b.


Proof. Part (a) follows from the equations L(l) = 0 and L(e) = 1. Next we prove (c), 
the functional equation for the exponential. Assume that a and b are given and let 


x = E(a), y = E(b ) , c = L(xy) . 

Then we have 

L(x) = a, L(y) = b , E(c) = xy . 

But c = L(xy) = L(x) + L(y) = a + b. That is, c = a + b. Hence, E(c) = E(a + b). 
On the other hand, E(c) = xy = E(a)E(b), so E( a + b) = E(a)E(b), which proves (c). 

Now we use the functional equation to help us prove (b). The difference quotient for
the derivative E'(x) is

$\frac{E(x + h) - E(x)}{h} = \frac{E(x)E(h) - E(x)}{h} = E(x) \frac{E(h) - 1}{h} .$

Therefore, to prove (b) we must show that

(6.27)    $\lim_{h \to 0} \frac{E(h) - 1}{h} = 1 .$

We shall express the quotient in (6.27) in terms of the logarithm. Let k = E(h) - 1.
Then k + 1 = E(h) so L(k + 1) = h and the quotient is equal to

(6.28)    $\frac{E(h) - 1}{h} = \frac{k}{L(k + 1)} .$

Now as $h \to 0$, $E(h) \to 1$ because the exponential function is continuous at 0. Since
k = E(h) - 1, we have $k \to 0$ as $h \to 0$. But

$\frac{L(k + 1)}{k} = \frac{L(k + 1) - L(1)}{k} \to L'(1) = 1$    as $k \to 0$ .

In view of (6.28), this proves (6.27) which, in turn, proves (b).
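
Since E is defined only as the inverse of the logarithm, it is instructive to compute it that way. The Python sketch below is an illustration under stated assumptions: math.log stands in for L, the name E and the use of Newton's method are choices made for this example and do not come from the text. It solves L(y) = x for y and thereby allows a numerical look at properties (a), (b), and (c).

```python
import math

def E(x, tol=1e-14, max_iter=200):
    """Solve log(y) = x for y by Newton's method (a numerical stand-in for E)."""
    if x < 0:
        # E(-x) = 1/E(x), a consequence of the functional equation.
        return 1.0 / E(-x, tol, max_iter)
    y = 1.0                              # log(1) = 0 <= x, so we start at or left of the root
    for _ in range(max_iter):
        step = y * (x - math.log(y))     # Newton step for the equation log(y) - x = 0
        y += step
        if abs(step) <= tol * y:
            break
    return y

h = 1e-6
print(E(1.0))                                     # property (a): compare with e
print((E(2.0 + h) - E(2.0 - h)) / (2 * h), E(2.0))  # property (b)
print(E(0.3 + 0.9), E(0.3) * E(0.9))              # property (c)
```

Because the logarithm is concave, the Newton iterates started at y = 1 increase toward the solution when x > 0, so log is never evaluated at a nonpositive number.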
6.13 Exponentials expressed as powers of e

The functional equation E(a + b) = E(a)E(b) has many interesting consequences. For
example, we can use it to prove that

(6.29)    $E(r) = e^r$

for every rational number r.

First we take b = -a in the functional equation to get

E(a)E(-a) = E(0) = 1 ,

and hence E(-a) = 1/E(a) for every real a. Taking b = a, b = 2a, . . . , b = na in the
functional equation we obtain, successively, $E(2a) = E(a)^2$, $E(3a) = E(a)^3$, and, in general,
we have

(6.30)    $E(na) = E(a)^n$

for every positive integer n. In particular, when a = 1, we obtain

$E(n) = e^n ,$

whereas for a = 1/n, we obtain $E(1) = E(1/n)^n$. Since E(1/n) > 0, this implies

(6.31)    $E(1/n) = e^{1/n} .$

Therefore, if we put a = 1/m in (6.30) and use (6.31), we find

$E(n/m) = E(1/m)^n = e^{n/m}$

for all positive integers m and n. In other words, we have proved (6.29) for every positive
rational number r. Since $E(-r) = 1/E(r) = e^{-r}$, it also holds for all negative rational r.


6.14 The definition of $e^x$ for arbitrary real x

In the foregoing section we proved that $e^x = E(x)$ when x is any rational number. Now
we shall define $e^x$ for irrational x by writing

(6.32)    $e^x = E(x)$    for every real x .

One justification for this definition is that we can use it to prove that the law of exponents

(6.33)    $e^a e^b = e^{a+b}$

is valid for all real exponents a and b. When we use the definition in (6.32), the proof of
(6.33) is a triviality because (6.33) is nothing but a restatement of the functional equation.

The notation $e^x$ for E(x) is the one that is commonly used for the exponential. Occasionally
exp(x) is written instead of $e^x$, especially when complicated formulas appear in the
exponent. We shall continue to use E(x) from time to time in this chapter, but later we
shall switch to $e^x$.

We have defined the exponential function so that the two equations

$y = e^x$    and    x = log y

mean exactly the same thing. In the next section we shall define more general powers so
that the two equations $y = a^x$ and $x = \log_a y$ will be equivalent.


6.15 The definition of $a^x$ for a > 0 and x real

Now that we have defined $e^x$ for arbitrary real x, there is absolutely no difficulty in
formulating a definition of $a^x$ for every a > 0. One way to proceed is to let $a^x$ denote that
number y such that $\log_a y = x$. But this does not work for a = 1, since logarithms to the
base 1 have not been defined. Another way is to define $a^x$ by the formula

(6.34)    $a^x = e^{x \log a} .$

The second method is preferable because, first of all, it is meaningful for all positive a
(including a = 1) and, secondly, it makes it easy to prove the following properties of
exponentials:

$\log a^x = x \log a .$    $(ab)^x = a^x b^x .$
$a^x a^y = a^{x+y} .$    $(a^x)^y = (a^y)^x = a^{xy} .$

If $a \ne 1$, then $y = a^x$ if and only if $x = \log_a y$.

The proofs of these properties are left as exercises for the reader.

Just as the graph of the exponential function was obtained from that of the logarithm
by reflection through the line y = x, so the graph of $y = a^x$ can be obtained from that
of $y = \log_a x$ by reflection through the same line; examples are shown in Figure 6.7. The
curves in Figure 6.7 were obtained by reflection of those in Figure 6.3. The graph
corresponding to a = 1 is, of course, the horizontal line y = 1.
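
Definition (6.34) can be written down verbatim in Python. In the sketch below, which is illustrative only, the name power is an assumption and math.exp and math.log play the roles of E and L; the printed pairs allow a numerical comparison of both sides of the listed properties.

```python
import math

def power(a, x):
    """a**x defined as e**(x log a), for a > 0 and real x."""
    if a <= 0:
        raise ValueError("the base a must be positive")
    return math.exp(x * math.log(a))

a, b, x, y = 2.0, 3.0, 1.5, -0.4
print(power(a * b, x), power(a, x) * power(b, x))   # (ab)^x = a^x b^x
print(power(a, x) * power(a, y), power(a, x + y))   # a^x a^y = a^(x+y)
print(power(power(a, x), y), power(a, x * y))       # (a^x)^y = a^(xy)
```

The definition also covers a = 1, where log a = 0 and every power equals 1.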


6.16 Differentiation and integration formulas involving exponentials 

One of the most remarkable properties of the exponential function is the formula 

(6.35) E'(x) = E(x) , 

which tells us that this function is its own derivative. If we use this along with the chain 
rule, we can obtain differentiation formulas for exponential functions with any positive 
base a. 

Suppose $f(x) = a^x$ for a > 0. By the definition of $a^x$, we may write

$f(x) = e^{x \log a} = E(x \log a)$ ;

hence, by the chain rule, we find

(6.36)    $f'(x) = E'(x \log a) \cdot \log a = E(x \log a) \cdot \log a = a^x \log a .$

In other words, differentiation of $a^x$ simply multiplies $a^x$ by the constant factor log a, this
factor being 1 when a = e.
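
Formula (6.36) is easy to test against a difference quotient. The following Python sketch is an informal check; the names d_power and d_numeric are assumptions introduced for the illustration.

```python
import math

def d_power(a, x):
    """Derivative of a**x with respect to x, from (6.36): a**x * log a."""
    return a**x * math.log(a)

def d_numeric(a, x, h=1e-6):
    """Central-difference estimate of the same derivative."""
    return (a ** (x + h) - a ** (x - h)) / (2 * h)

for a in (0.5, 2.0, math.e, 10.0):
    print(a, d_power(a, 1.0), d_numeric(a, 1.0))
```

For a = e the factor log a is 1, so the two printed values should both be close to e itself.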

where u now represents any function with a continuous derivative. If we write u = f(x),
and du = f'(x) dx, the formulas in (6.39) become

$\int e^{f(x)} f'(x)\,dx = e^{f(x)} + C ,$    $\int a^{f(x)} f'(x)\,dx = \frac{a^{f(x)}}{\log a} + C ,$

the second of these being valid for a > 0, $a \ne 1$.
EXAMPLE 1. Integrate $\int x^2 e^{x^3}\,dx$.

Solution. Let $u = x^3$. Then $du = 3x^2\,dx$, and we obtain

$\int x^2 e^{x^3}\,dx = \frac{1}{3} \int e^{x^3} (3x^2\,dx) = \frac{1}{3} \int e^u\,du = \frac{1}{3} e^u + C = \frac{1}{3} e^{x^3} + C .$

EXAMPLE 2. Integrate $\int \frac{2^{\sqrt{x}}}{\sqrt{x}}\,dx$.

Solution. Let $u = \sqrt{x} = x^{1/2}$. Then $du = \frac{1}{2} x^{-1/2}\,dx = \frac{1}{2}\,dx/\sqrt{x}$. Hence we have

$\int \frac{2^{\sqrt{x}}}{\sqrt{x}}\,dx = 2 \int 2^{\sqrt{x}} \left( \frac{1}{2} \frac{dx}{\sqrt{x}} \right) = 2 \int 2^u\,du = \frac{2 \cdot 2^u}{\log 2} + C = \frac{2^{1+\sqrt{x}}}{\log 2} + C .$

EXAMPLE 3. Integrate $\int \cos x\, e^{2 \sin x}\,dx$.

Solution. Let u = 2 sin x. Then du = 2 cos x dx, and hence we obtain

$\int \cos x\, e^{2 \sin x}\,dx = \frac{1}{2} \int e^{2 \sin x} (2 \cos x\,dx) = \frac{1}{2} \int e^u\,du = \frac{1}{2} e^u + C = \frac{1}{2} e^{2 \sin x} + C .$

EXAMPLE 4. Integrate $\int e^x \sin x\,dx$.

Solution. Let $u = e^x$, dv = sin x dx. Then $du = e^x\,dx$, v = -cos x, and we find

(6.40)    $\int e^x \sin x\,dx = \int u\,dv = uv - \int v\,du = -e^x \cos x + \int e^x \cos x\,dx + C .$

The integral $\int e^x \cos x\,dx$ is treated in the same way. We let $u = e^x$, dv = cos x dx, $du = e^x\,dx$,
v = sin x, and we obtain

(6.41)    $\int e^x \cos x\,dx = e^x \sin x - \int e^x \sin x\,dx + C .$

Substituting this in (6.40), we may solve for $\int e^x \sin x\,dx$ and consolidate the arbitrary
constants to obtain

$\int e^x \sin x\,dx = \frac{e^x}{2} (\sin x - \cos x) + C .$

Notice that we can use this in (6.41) to obtain also

$\int e^x \cos x\,dx = \frac{e^x}{2} (\cos x + \sin x) + C .$

EXAMPLE 5. Integrate $\int \frac{dx}{1 + e^x}$.

Solution. One way to treat this example is to rewrite the integrand as follows:

$\frac{1}{1 + e^x} = \frac{e^{-x}}{e^{-x} + 1} .$

Now put $u = e^{-x} + 1$. Then $du = -e^{-x}\,dx$, and we get

$\int \frac{e^{-x}}{e^{-x} + 1}\,dx = -\int \frac{-e^{-x}\,dx}{e^{-x} + 1} = -\int \frac{du}{u} = -\log |u| + C = -\log (1 + e^{-x}) + C .$

The result can be written in other ways if we manipulate the logarithm. For instance,

$-\log (1 + e^{-x}) = \log \frac{1}{1 + e^{-x}} = \log \frac{e^x}{e^x + 1} = \log (e^x) - \log (e^x + 1) = x - \log (1 + e^x) .$

Another way to treat this same example is to write

$\frac{1}{1 + e^x} = 1 - \frac{e^x}{1 + e^x} .$

Then we have

$\int \frac{dx}{1 + e^x} = x - \int \frac{e^x}{1 + e^x}\,dx = x - \int \frac{du}{u} ,$

where $u = 1 + e^x$. Thus we find

$\int \frac{dx}{1 + e^x} = x - \log (1 + e^x) + C ,$

which is one of the forms obtained above.
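
The answers to Examples 1 through 5 can be checked by differentiating them numerically. The Python sketch below is an informal verification only; the helper names derivative and worst_gap and the sample points are assumptions chosen for the illustration.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

examples = [
    # (antiderivative found in the example, original integrand)
    (lambda x: math.exp(x**3) / 3,
     lambda x: x**2 * math.exp(x**3)),
    (lambda x: 2 ** (1 + math.sqrt(x)) / math.log(2),
     lambda x: 2 ** math.sqrt(x) / math.sqrt(x)),
    (lambda x: math.exp(2 * math.sin(x)) / 2,
     lambda x: math.cos(x) * math.exp(2 * math.sin(x))),
    (lambda x: math.exp(x) * (math.sin(x) - math.cos(x)) / 2,
     lambda x: math.exp(x) * math.sin(x)),
    (lambda x: x - math.log(1 + math.exp(x)),
     lambda x: 1 / (1 + math.exp(x))),
]

def worst_gap(points=(0.4, 1.0, 1.7)):
    """Largest difference between a numerical derivative and its integrand."""
    return max(abs(derivative(F, x) - f(x)) for F, f in examples for x in points)

print(worst_gap())
```

A small printed value indicates agreement at the sample points; it does not replace the derivations given above.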


6.17 Exercises 

In Exercises 1 through 12, find the derivative f'(x). In each case the function f is assumed to be
defined for all real x for which the given formula for f(x) is meaningful.

1. $f(x) = e^{3x-1}$.

2. $f(x) = e^{4x^2}$.

3. $f(x) = e^{-x^2}$.

4. $f(x) = e^{\sqrt{x}}$.

5. $f(x) = e^{1/x}$.

6. $f(x) = 2^x$.

7. $f(x) = 2^{x^2}$ [which means $2^{(x^2)}$].

8. $f(x) = e^{\sin x}$.

9. $f(x) = e^{\cos^2 x}$.

10. $f(x) = e^{\log x}$.

11. $f(x) = e^{e^x}$ [which means $e^{(e^x)}$].

12. $f(x) = e^{e^{e^x}}$ [which means $\exp (e^{(e^x)})$].

Evaluate the indefinite integrals in Exercises 13 through 18.

13. $\int x e^x\,dx$.

14. $\int x e^{-x}\,dx$.

15. $\int x^2 e^x\,dx$.

16. $\int x^2 e^{-2x}\,dx$.

17. $\int e^{\sqrt{x}}\,dx$.

18. $\int x^3 e^{-x^2}\,dx$.

19. Determine all constants a and b such that $e^x = b + \int_a^x e^t\,dt$.

20. Let $A = \int e^{ax} \cos bx\,dx$ and $B = \int e^{ax} \sin bx\,dx$, where a and b are constants, not both zero.
Use integration by parts to show that

$aA - bB = e^{ax} \cos bx + C_1 ,$    $aB + bA = e^{ax} \sin bx + C_2 ,$

where $C_1$ and $C_2$ are arbitrary constants. Solve for A and B to deduce the following integration
formulas:

$\int e^{ax} \cos bx\,dx = \frac{e^{ax} (a \cos bx + b \sin bx)}{a^2 + b^2} + C ,$

$\int e^{ax} \sin bx\,dx = \frac{e^{ax} (a \sin bx - b \cos bx)}{a^2 + b^2} + C .$


In Exercises 21 through 34, find the derivative f'(x). In each case, the function f is assumed to be
defined for all real x for which the given formula for f(x) is meaningful. Logarithmic differentiation
may simplify the work in some cases.

21. $f(x) = x^x$.

22. $f(x) = (1 + x)(1 + e^{x^2})$.

23. $f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$.

24. $f(x) = x^{a^a} + a^{x^a} + a^{a^x}$.

25. $f(x) = \log [\log (\log x)]$.

26. $f(x) = \log (e^x + \sqrt{1 + e^{2x}})$.

27. $f(x) = x^{x^x}$.

28. $f(x) = (\log x)^x$.

29. $f(x) = x^{\log x}$.

30. $f(x) = \frac{(\log x)^x}{x^{\log x}}$.

31. $f(x) = (\sin x)^{\cos x} + (\cos x)^{\sin x}$.

32. $f(x) = x^{1/x}$.

33. $f(x) = \frac{x^2 (3 - x)^{1/3}}{(1 - x)(3 + x)^{2/3}}$.

34. $f(x) = \prod_{i=1}^{n} (x - a_i)^{b_i}$.

35. Let $f(x) = x^r$, where x > 0 and r is any real number. The formula $f'(x) = r x^{r-1}$ was proved
earlier for rational r.

(a) Show that this formula also holds for arbitrary real r. [Hint: Write $x^r = e^{r \log x}$.]

(b) Discuss under what conditions the result of part (a) applies for x < 0.

36. Use the definition $a^x = e^{x \log a}$ to derive the following properties of general exponentials:

(a) $\log a^x = x \log a$.

(b) $(ab)^x = a^x b^x$.

(c) $a^x a^y = a^{x+y}$.

(d) $(a^x)^y = (a^y)^x = a^{xy}$.

(e) If $a \ne 1$, then $y = a^x$ if and only if $x = \log_a y$.

37. Let $f(x) = \frac{1}{2} (a^x + a^{-x})$ if a > 0. Show that

$f(x + y) + f(x - y) = 2 f(x) f(y) .$

38. Let $f(x) = e^{cx}$, where c is a constant. Show that f'(0) = c, and use this to deduce the following
limit relation:

$\lim_{x \to 0} \frac{e^{cx} - 1}{x} = c .$

39. Let f be a function defined everywhere on the real axis, with a derivative f' which satisfies
the equation

f'(x) = c f(x) for every x ,

where c is a constant. Prove that there is a constant K such that $f(x) = K e^{cx}$ for every x.
[Hint: Let $g(x) = f(x) e^{-cx}$ and consider g'(x).]

40. Let f be a function defined everywhere on the real axis. Suppose also that f satisfies the
functional equation

(i)    f(x + y) = f(x)f(y) for all x and y .

(a) Using only the functional equation, prove that f(0) is either 0 or 1. Also, prove that if
$f(0) \ne 0$ then $f(x) \ne 0$ for all x.

Assume, in addition to (i), that f'(x) exists for all x, and prove the following statements:

(b) f'(x)f(y) = f'(y)f(x) for all x and y.

(c) There is a constant c such that f'(x) = c f(x) for all x.

(d) $f(x) = e^{cx}$ if $f(0) \ne 0$. [Hint: See Exercise 39.]

41. (a) Let $f(x) = e^x - 1 - x$ for all x. Prove that $f'(x) \ge 0$ if $x \ge 0$ and $f'(x) \le 0$ if $x \le 0$.
Use this fact to deduce the inequalities

$e^x > 1 + x ,$    $e^{-x} > 1 - x ,$

valid for all x > 0. (When x = 0, these become equalities.)

Integrate these inequalities to derive the following further inequalities, all valid for x > 0:

(b) $e^x > 1 + x + \frac{x^2}{2!} ,$    $e^{-x} < 1 - x + \frac{x^2}{2!} .$

(c) $e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} ,$    $e^{-x} > 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} .$

(d) Guess the generalization suggested and prove your result.

42. If n is a positive integer and if x > 0, show that

$\left( 1 + \frac{x}{n} \right)^n < e^x ,$    and that    $e^x < \left( 1 - \frac{x}{n} \right)^{-n}$    if x < n .

By choosing a suitable value of n, deduce that 2.5 < e < 2.99.

43. Let $f(x, y) = x^y$ where x > 0. Show that

$\frac{\partial f}{\partial x} = y x^{y-1}$    and    $\frac{\partial f}{\partial y} = x^y \log x .$
6.18 The hyperbolic functions

Certain combinations of exponential functions occur quite frequently in analysis, and
it is worthwhile to give these combinations special names and to study them as examples
of new functions. These combinations, called the hyperbolic sine (sinh), the hyperbolic
cosine (cosh), the hyperbolic tangent (tanh), etc., are defined as follows:

$\sinh x = \frac{e^x - e^{-x}}{2} ,$    $\cosh x = \frac{e^x + e^{-x}}{2} ,$    $\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}} ,$

$\operatorname{csch} x = \frac{1}{\sinh x} ,$    $\operatorname{sech} x = \frac{1}{\cosh x} ,$    $\coth x = \frac{1}{\tanh x} .$

Figure 6.8 Graphs of hyperbolic functions.


The prefix “hyperbolic” is due to the fact that these functions are related geometrically 
to a hyperbola in much the same way as the trigonometric functions are related to a circle. 
This relation will be discussed in more detail in Chapter 14 when we study the hyperbola. 
The graphs of the sinh, cosh, and tanh are shown in Figure 6.8. 

The hyperbolic functions possess many properties that resemble those of the trigonometric 
functions. Some of these are listed as exercises in the following section. 
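
The definitions can be tried out directly. In the Python sketch below, given only as an illustration, the functions sinh, cosh, and tanh are written out from the exponential formulas above (the standard library also offers math.sinh, math.cosh, and math.tanh, which are used here only for comparison).

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x):
    return sinh(x) / cosh(x)

for x in (-2.0, -0.3, 0.0, 0.7, 3.0):
    # Compare with the library versions and with the identity cosh^2 - sinh^2 = 1.
    print(x, sinh(x) - math.sinh(x), cosh(x) - math.cosh(x), cosh(x) ** 2 - sinh(x) ** 2)
```

The last printed column should stay close to 1, in analogy with the trigonometric identity for $\cos^2 x + \sin^2 x$.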


6.19 Exercises 

Derive the properties of the hyperbolic functions listed in Exercises 1 through 15 and compare 
them, whenever possible, with the corresponding properties of the trigonometric functions. 

1. $\cosh^2 x - \sinh^2 x = 1$.

2. sinh (-x) = -sinh x.

3. cosh (-x) = cosh x.

4. tanh (-x) = -tanh x.

5. sinh (x + y) = sinh x cosh y + cosh x sinh y.

6. cosh (x + y) = cosh x cosh y + sinh x sinh y.

7. sinh 2x = 2 sinh x cosh x.

8. $\cosh 2x = \cosh^2 x + \sinh^2 x$.

9. $\cosh x + \sinh x = e^x$.

10. $\cosh x - \sinh x = e^{-x}$.

11. $(\cosh x + \sinh x)^n = \cosh nx + \sinh nx$ (n an integer).

12. $2 \sinh^2 \frac{1}{2} x = \cosh x - 1$.

13. $2 \cosh^2 \frac{1}{2} x = \cosh x + 1$.

14. $\tanh^2 x + \operatorname{sech}^2 x = 1$.

15. $\coth^2 x - \operatorname{csch}^2 x = 1$.

16. Find cosh x if sinh x = f . 

17. Find sinh x if cosh x = f and x > 0. 

18. Find sinh x and cosh x if tanh X = fs- 

19. Find cosh (x + y) if sinh x = 3 and sinh y = f . 

20. Find tanh 2x if tanh x = | . 

In Exercises 21 through 26, prove the differentiation formulas. 

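21. $D \sinh x = \cosh x$.

22. $D \cosh x = \sinh x$.

23. $D \tanh x = \operatorname{sech}^2 x$.

24. $D \coth x = -\operatorname{csch}^2 x$.

25. $D \operatorname{sech} x = -\operatorname{sech} x \tanh x$.

26. $D \operatorname{csch} x = -\operatorname{csch} x \coth x$.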


6.20 Derivatives of inverse functions 

We have applied the process of inversion to construct the exponential function from the 
logarithm. In the next section, we shall invert the trigonometric functions. It is convenient 
at this point to discuss a general theorem which shows that the process of inversion transmits 
differentiability from a function to its inverse. 

theorem 6.7. Assume f is strictly increasing and continuous on an interval [a, b], and
let g be the inverse of f. If the derivative f'(x) exists and is nonzero at a point x in (a, b),
then the derivative g'(y) also exists and is nonzero at the corresponding point y, where y =
f(x). Moreover, the two derivatives are reciprocals of each other; that is, we have

$$g'(y) = \frac{1}{f'(x)} . \tag{6.42}$$

Note: If we use the Leibniz notation and write y for f(x), dy/dx for f'(x), x for g(y), and
dx/dy for g'(y), then Equation (6.42) becomes

$$\frac{dx}{dy} = \frac{1}{dy/dx} ,$$

which has the appearance of a trivial algebraic identity.

Proof. Assume x is a point in (a, b) where f'(x) exists and is nonzero, and let y = f(x).
We shall show that the difference quotient

$$\frac{g(y + k) - g(y)}{k}$$

approaches the limit 1/f'(x) as $k \to 0$.

Let h = g(y + k) - g(y). Since x = g(y), this implies h = g(y + k) - x or x + h =
g(y + k). Therefore y + k = f(x + h), and hence k = f(x + h) - f(x). Note that
$h \ne 0$ if $k \ne 0$ because g is strictly increasing. Therefore, if $k \ne 0$, the difference quotient
in question is

$$\frac{g(y + k) - g(y)}{k} = \frac{h}{f(x + h) - f(x)} = \frac{1}{[f(x + h) - f(x)]/h} . \tag{6.43}$$

As $k \to 0$, the difference $g(y + k) - g(y) \to 0$ because of the continuity of g at y [property
(b) of Theorem 3.10]. This means that $h \to 0$ as $k \to 0$. But we know that the difference
quotient in the denominator on the extreme right of (6.43) approaches f'(x) as $h \to 0$
[since f'(x) exists]. Therefore, when $k \to 0$, the quotient on the extreme left of (6.43)
approaches the limit 1/f'(x). This proves Theorem 6.7.
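
Theorem 6.7 can be illustrated numerically. The sketch below is not from the original text: it assumes the sample function $f(x) = x^3 + x$ (strictly increasing, with a derivative that never vanishes), computes the inverse g by bisection, and compares a symmetric difference quotient of g with 1/f'(x). The names `f`, `f_prime`, `g`, and `g_prime_numeric` are illustrative choices.

```python
def f(x):
    return x ** 3 + x            # strictly increasing on the whole real line

def f_prime(x):
    return 3 * x ** 2 + 1        # never zero

def g(y, lo=-10.0, hi=10.0):
    """Inverse of f on [lo, hi], located by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def g_prime_numeric(y, k=1e-6):
    # symmetric difference quotient of g at y (the proof uses a one-sided quotient)
    return (g(y + k) - g(y - k)) / (2 * k)

x = 1.5
y = f(x)
print(g_prime_numeric(y), 1 / f_prime(x))   # the two values should nearly agree
```

The agreement is only approximate, since both the bisection and the difference quotient introduce small errors.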


6.21 Inverses of the trigonometric functions 

The process of inversion may be applied to the trigonometric functions. Suppose we
begin with the sine function. To determine a unique inverse, we must consider the sine
over some interval where it is monotonic. There are, of course, many such intervals, for
example $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$, $[\tfrac{1}{2}\pi, \tfrac{3}{2}\pi]$, $[-\tfrac{3}{2}\pi, -\tfrac{1}{2}\pi]$, etc., and it really does not matter which one of
these we choose. It is customary to select $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ and define a new function f as follows:

$$f(x) = \sin x \quad\text{if}\quad -\frac{\pi}{2} \le x \le \frac{\pi}{2} .$$

Figure 6.9 y = sin x.


The function f so defined is strictly increasing and it assumes every value between -1
and +1 exactly once on the interval $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$. (See Figure 6.9.) Hence there is a uniquely
determined function g defined on [-1, 1] which assigns to each number y in [-1, 1] that
number x in $[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$ for which y = sin x. This function g is called the inverse sine or
arcsine, and its value at y is denoted by arcsin y, or by $\sin^{-1} y$. Thus,

$$u = \arcsin v \quad\text{means}\quad v = \sin u \quad\text{and}\quad -\frac{\pi}{2} \le u \le \frac{\pi}{2} .$$

The graph of the arc sine is shown in Figure 6.10. Note that the arc sine is not defined
outside the interval [-1, 1].
The derivative of the arc sine can be obtained from formula (6.42) of Section 6.20.
In this case we have f'(x) = cos x and this is nonzero in the open interval $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$.
Therefore formula (6.42) yields

$$g'(y) = \frac{1}{f'(x)} = \frac{1}{\cos x} = \frac{1}{\sqrt{1 - \sin^2 x}} = \frac{1}{\sqrt{1 - y^2}} \quad\text{if}\quad -1 < y < 1 .$$

With a change in notation we can write this result as follows:

$$D \arcsin x = \frac{1}{\sqrt{1 - x^2}} \quad\text{if}\quad -1 < x < 1 . \tag{6.44}$$

Of course, this now gives us a new integration formula,

$$\int_0^x \frac{dt}{\sqrt{1 - t^2}} = \arcsin x , \tag{6.45}$$

which is valid for -1 < x < 1.

Note: This formula may be used as the starting point for a completely analytic theory 

of the trigonometric functions, without any reference to geometry. Briefly, the idea is to 
begin with the arc sine function, defining it by the integral in (6.45), just as we defined the 
logarithm as an integral. Next, the sine function is defined as the inverse of the arc sine, 
and the cosine as the derivative of the sine. Many details are required to carry out this 
program completely and we shall not attempt to describe them here. An alternative 
method for introducing the trigonometric functions analytically will be mentioned in 
Chapter 11. 

In the Leibniz notation for indefinite integrals we may write formula (6.45) in the form

$$\int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x + C . \tag{6.46}$$

Integration by parts yields the following further integration formula:

$$\int \arcsin x \, dx = x \arcsin x - \int \frac{x \, dx}{\sqrt{1 - x^2}} = x \arcsin x + \sqrt{1 - x^2} + C .$$
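
Both integration formulas for the arc sine can be compared with a direct numerical integration. The following sketch is an illustration only; it assumes a simple midpoint rule (the helper `midpoint_integral` is a hypothetical name, not a library function) and checks the formulas at x = 0.5.

```python
import math

def midpoint_integral(func, a, b, n=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def integrand(t):
    # integrand of (6.45)
    return 1.0 / math.sqrt(1.0 - t * t)

def arcsin_primitive(x):
    # x arcsin x + sqrt(1 - x^2), the primitive obtained by integration by parts
    return x * math.asin(x) + math.sqrt(1.0 - x * x)

x = 0.5
lhs_645 = midpoint_integral(integrand, 0.0, x)
rhs_645 = math.asin(x)

lhs_parts = midpoint_integral(math.asin, 0.0, x)
rhs_parts = arcsin_primitive(x) - arcsin_primitive(0.0)

print(lhs_645, rhs_645)
print(lhs_parts, rhs_parts)
```

The point x = 0.5 is chosen away from the endpoints, where the integrand of (6.45) becomes unbounded and the midpoint rule converges slowly.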

The cosine and tangent are inverted in a similar fashion. For the cosine it is customary
to choose the interval $[0, \pi]$ in which to perform the inversion. (See Figure 6.11.) The
resulting inverse function, called the arc cosine, is defined as follows:

$$u = \arccos v \quad\text{means}\quad v = \cos u \quad\text{and}\quad 0 \le u \le \pi .$$


The graph of the arc cosine function is shown in Figure 6.12. 
Figure 6.11 y = cos x.

To invert the tangent we choose the open interval $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$ (see Figure 6.13) and we
define the arc tangent as follows:

$$u = \arctan v \quad\text{means}\quad v = \tan u \quad\text{and}\quad -\frac{\pi}{2} < u < \frac{\pi}{2} .$$

Figure 6.14 shows a portion of the graph of the arc tangent function.

The argument used to derive (6.44) can also be applied to the arc cosine and arc tangent
functions, and it yields the following differentiation formulas:

$$D \arccos x = \frac{-1}{\sqrt{1 - x^2}} , \tag{6.47}$$

valid for -1 < x < 1, and

$$D \arctan x = \frac{1}{1 + x^2} , \tag{6.48}$$

valid for all real x.
When (6.47) is translated into an integration formula it becomes

$$\int_0^x \frac{dt}{\sqrt{1 - t^2}} = -(\arccos x - \arccos 0) = \frac{\pi}{2} - \arccos x \tag{6.49}$$

if -1 < x < 1. By comparing (6.49) with (6.45), we deduce the relation $\tfrac{1}{2}\pi - \arccos x =
\arcsin x$. (This may also be deduced from the familiar identity $\sin(\tfrac{1}{2}\pi - y) =
\cos y$ if we write y = arccos x.) In the Leibniz notation for indefinite integrals, we may
write (6.49) as follows:

$$\int \frac{dx}{\sqrt{1 - x^2}} = -\arccos x + C . \tag{6.50}$$

Similarly, from (6.48) we obtain

$$\int_0^x \frac{dt}{1 + t^2} = \arctan x \quad\text{or}\quad \int \frac{dx}{1 + x^2} = \arctan x + C . \tag{6.51}$$


Using integration by parts in conjunction with (6.50) and (6.51), we can derive the
following further integration formulas:

$$\int \arccos x \, dx = x \arccos x + \int \frac{x \, dx}{\sqrt{1 - x^2}} = x \arccos x - \sqrt{1 - x^2} + C ,$$

$$\int \arctan x \, dx = x \arctan x - \int \frac{x \, dx}{1 + x^2} = x \arctan x - \frac{1}{2} \log (1 + x^2) + C .$$

The inverses of the cotangent, secant, and cosecant can be defined by means of the
following formulas:

$$\operatorname{arccot} x = \frac{\pi}{2} - \arctan x \quad\text{for all real } x , \tag{6.52}$$

$$\operatorname{arcsec} x = \arccos \frac{1}{x} \quad\text{when } |x| \ge 1 , \tag{6.53}$$

$$\operatorname{arccsc} x = \arcsin \frac{1}{x} \quad\text{when } |x| \ge 1 . \tag{6.54}$$

Differentiation and integration formulas for these functions are listed in the following
exercises.


6.22 Exercises 

Derive the differentiation formulas in Exercises 1 through 5.

1. $D \arccos x = \dfrac{-1}{\sqrt{1 - x^2}}$ if -1 < x < 1.

2. $D \arctan x = \dfrac{1}{1 + x^2}$ for all real x.

3. $D \operatorname{arccot} x = \dfrac{-1}{1 + x^2}$ for all real x.

4. $D \operatorname{arcsec} x = \dfrac{1}{|x| \sqrt{x^2 - 1}}$ if |x| > 1.

5. $D \operatorname{arccsc} x = \dfrac{-1}{|x| \sqrt{x^2 - 1}}$ if |x| > 1.

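Derive the integration formulas in Exercises 6 through 10.

6. $\int \operatorname{arccot} x \, dx = x \operatorname{arccot} x + \tfrac{1}{2} \log (1 + x^2) + C$.

7. $\int \operatorname{arcsec} x \, dx = x \operatorname{arcsec} x - \dfrac{x}{|x|} \log |x + \sqrt{x^2 - 1}| + C$.

8. $\int \operatorname{arccsc} x \, dx = x \operatorname{arccsc} x + \dfrac{x}{|x|} \log |x + \sqrt{x^2 - 1}| + C$.

9. $\int (\arcsin x)^2 \, dx = x (\arcsin x)^2 - 2x + 2 \sqrt{1 - x^2} \arcsin x + C$.

10. $\int \dfrac{\arcsin x}{x^2} \, dx = \log \left| \dfrac{1 - \sqrt{1 - x^2}}{x} \right| - \dfrac{\arcsin x}{x} + C$.

11. (a) Show that $D\left(\operatorname{arccot} x - \arctan \dfrac{1}{x}\right) = 0$ for all $x \ne 0$.

(b) Prove that there is no constant C such that $\operatorname{arccot} x - \arctan (1/x) = C$ for all $x \ne 0$.
Explain why this does not contradict the zero-derivative theorem (Theorem 5.2).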

In Exercises 12 through 25, find the derivative f'(x). In each case the function f is assumed to be
defined for all real x for which the given formula for f(x) is meaningful.


12. f(x) = arcsin - . 


13. f(x) = arccos 


14. f(x) = arccos 


1 -X 


V5 ' 


1 


15. f(x) = arcsin (sin x). 

16. f(x) = Vx - arctan Vx. 

17. f(x) = arctan x + J arctan (x 3 ). 

1-x 2 


18. f(x) = arcsin 


19. f(x) = arctan (tan 2 x), 

20. f(x) = arctan (x + V 1 + x 2 ). 

21. f(x) = arcsin (sin x — cos x). 

22. f(x) = arccos V / 1 - x 2 . 

1 +x 

23. f(x) = arctan . 

1 — x 

24. f(x) = [arccos (x 2 )]' 2 . 

1 


1 +x 2 ' 


25. f(x) = log arccos 


(arccos ±) . 


26. Show that $dy/dx = (x + y)/(x - y)$ if $\arctan (y/x) = \log \sqrt{x^2 + y^2}$.

27. Compute $d^2y/dx^2$ if $y = (\arcsin x)/\sqrt{1 - x^2}$ for |x| < 1.

28. Let $f(x) = \arctan x - x + \tfrac{1}{3}x^3$. Examine the sign of f' to prove that

$$x - \frac{x^3}{3} < \arctan x \quad\text{if}\quad x > 0 .$$
In Exercises 29 through 47, evaluate the indefinite integrals.


I 
' I 


dx 


29 

30 

3 1 J 
3 2 

33 


V a 2 — x 2 
dx 


a ^0. 




v'l -2x 

dx 

a 2 + x 2 ’ 
dx 
a + bx 2 


a jtO. 
(ab^O). 


C dx 

’ J x 2 - x + 2 ' 

34. $\int x \arctan x \, dx$.

35. $\int x^2 \arccos x \, dx$.

36. $\int x (\arctan x)^2 \, dx$.

37. $\int \arctan \sqrt{x} \, dx$.

f dx 
J \/7x~^~aTl 


a)(b - x ) 


b^a. 


38 . 


dx. 


' arctan xfx 
Vx(l + x) 

39. JVT^ dx 

-arctan a; 

(j + , : JT d-v 


40 

41 

42 

43 
44. f : 


r x 2 

' J (1 + x : 


: i’ 


dx. 


+ e 2 * 
arccot d 


45 


J «* 

■J(^. 


dx. 


dx. 


[///«/; x = sin u.] 


’ v ^arctan 

_iSj= d x 

77 1 ^2 )3 / 2 • 


dx, a > 0. 


46. $\int \sqrt{(x - a)(b - x)} \, dx$, $b \ne a$.
[Hint: $x - a = (b - a) \sin^2 u$.]


6.23 Integration by partial fractions 


We recall that a quotient of two polynomials is called a rational function. Differenti- 
ation of a rational function leads to a new rational function which may be obtained by 
the quotient rule for derivatives. On the other hand, integration of a rational function 
may lead to functions that are not rational. For example, we have 


$$\int \frac{dx}{x} = \log |x| + C \quad\text{and}\quad \int \frac{dx}{1 + x^2} = \arctan x + C .$$


We shall describe a method for computing the integral of any rational function, and we 
shall find that the result can always be expressed in terms of polynomials, rational functions, 
inverse tangents, and logarithms. 

The basic idea of the method is to decompose a given rational function into a sum of 
simpler fractions (called partial fractions) that can be integrated by the techniques discussed 
earlier. We shall describe the general procedure by means of a number of simple examples 
that illustrate all the essential features of the method. 


example 1. In this example we begin with two simple fractions, 1/(x - 1) and 1/(x + 3),
which we know how to integrate, and see what happens when we form a linear combination
of these fractions. For example, if we take twice the first fraction plus three times the
second, we obtain

$$\frac{2}{x - 1} + \frac{3}{x + 3} = \frac{2(x + 3) + 3(x - 1)}{(x - 1)(x + 3)} = \frac{5x + 3}{x^2 + 2x - 3} .$$

If, now, we read this formula from right to left, it tells us that the rational function r given
by $r(x) = (5x + 3)/(x^2 + 2x - 3)$ has been expressed as a linear combination of 1/(x - 1)
and 1/(x + 3). Therefore, we may evaluate the integral of r by writing

$$\int \frac{5x + 3}{x^2 + 2x - 3} \, dx = 2 \int \frac{dx}{x - 1} + 3 \int \frac{dx}{x + 3} = 2 \log |x - 1| + 3 \log |x + 3| + C .$$

example 2. The foregoing example suggests a procedure for dealing with integrals of
the form $\int (ax + b)/(x^2 + 2x - 3) \, dx$. For example, to evaluate $\int (2x + 5)/(x^2 + 2x - 3) \, dx$,
we try to express the integrand as a linear combination of 1/(x - 1) and 1/(x + 3) by writing

$$\frac{2x + 5}{x^2 + 2x - 3} = \frac{A}{x - 1} + \frac{B}{x + 3} \tag{6.55}$$

with constants A and B to be determined. If we can choose A and B so that Equation (6.55)
is an identity, then the integral of the fraction on the left is equal to the sum of the integrals
of the simpler fractions on the right. To find A and B, we multiply both sides of (6.55) by
(x - 1)(x + 3) to remove the fractions. This gives us

$$A(x + 3) + B(x - 1) = 2x + 5 . \tag{6.56}$$

At this stage there are two methods commonly used to find A and B. One method is to
equate coefficients of like powers of x in (6.56). This leads to the equations A + B = 2
and 3A - B = 5. Solving this pair of simultaneous equations, we obtain A = 7/4 and
B = 1/4. The other method involves the substitution of two values of x in (6.56) and leads
to another pair of equations for A and B. In this particular case, the presence of the factors
x - 1 and x + 3 suggests that we use the values x = 1 and x = -3. When we put x = 1
in (6.56), the coefficient of B vanishes, and we find 4A = 7, or A = 7/4. Similarly, we can
make the coefficient of A vanish by putting x = -3. This gives us -4B = -1, or B = 1/4.
In any event, we have found values of A and B to satisfy (6.55), so we have

$$\int \frac{2x + 5}{x^2 + 2x - 3} \, dx = \frac{7}{4} \int \frac{dx}{x - 1} + \frac{1}{4} \int \frac{dx}{x + 3} = \frac{7}{4} \log |x - 1| + \frac{1}{4} \log |x + 3| + C .$$
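
The substitution method of Example 2 is easy to carry out with exact rational arithmetic. The sketch below is an illustration, not part of the original text; it assumes Python's standard `fractions` module, and the helper names `numerator`, `lhs`, and `rhs` are chosen here for convenience.

```python
from fractions import Fraction

def numerator(x):
    # right-hand side of (6.56)
    return 2 * x + 5

# Put x = 1 in (6.56): the B-term vanishes and (1 + 3) A = numerator(1).
A = Fraction(numerator(1), 1 + 3)
# Put x = -3 in (6.56): the A-term vanishes and (-3 - 1) B = numerator(-3).
B = Fraction(numerator(-3), -3 - 1)

def lhs(x):
    # left member of (6.55)
    return Fraction(2 * x + 5, x * x + 2 * x - 3)

def rhs(x):
    # right member of (6.55) with the computed constants
    return A / (x - 1) + B / (x + 3)

print(A, B)
# (6.55) should hold at every integer other than the roots 1 and -3
print(all(lhs(x) == rhs(x) for x in range(-10, 11) if x not in (1, -3)))
```

Exact fractions avoid the rounding that floating-point arithmetic would introduce into the comparison.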


It is clear that the method described in Example 2 also applies to integrals of the form
$\int f(x)/g(x) \, dx$ in which f is a linear polynomial and g is a quadratic polynomial that can be
factored into distinct linear factors with real coefficients, say $g(x) = (x - x_1)(x - x_2)$. In
this case the quotient f(x)/g(x) can be expressed as a linear combination of $1/(x - x_1)$ and
$1/(x - x_2)$, and integration of f(x)/g(x) leads to a corresponding combination of the
logarithmic terms $\log |x - x_1|$ and $\log |x - x_2|$.

The foregoing examples involve rational functions f/g in which the degree of the
numerator is less than that of the denominator. A rational function with this property
is said to be a proper rational function. If f/g is improper, that is, if the degree of f is not
less than that of g, then we can express f/g as the sum of a polynomial and a proper rational
function. In fact, we simply divide f by g to obtain

$$\frac{f(x)}{g(x)} = Q(x) + \frac{R(x)}{g(x)} ,$$

where Q and R are polynomials (called the quotient and remainder, respectively) such that
the remainder has degree less than that of g. For example,

$$\frac{x^3 + 3x}{x^2 - 2x - 3} = x + 2 + \frac{10x + 6}{x^2 - 2x - 3} .$$

Therefore, in the study of integration technique, there is no loss in generality if we restrict
ourselves to proper rational functions, and from now on we consider $\int f(x)/g(x) \, dx$, where
f has degree less than that of g.
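
The division of f by g can be carried out mechanically. The following is a minimal sketch of polynomial long division, assuming polynomials are stored as lists of coefficients from the highest power down; the function name `poly_divmod` is a hypothetical helper, not a library routine.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by g; both are coefficient lists, highest degree first.

    Returns (quotient, remainder) with deg(remainder) < deg(g).
    """
    g = [Fraction(c) for c in g]
    remainder = [Fraction(c) for c in f]
    quotient = []
    while len(remainder) >= len(g):
        factor = remainder[0] / g[0]          # next coefficient of the quotient
        quotient.append(factor)
        for i in range(len(g)):
            remainder[i] -= factor * g[i]     # subtract factor * g, aligned at the top
        remainder.pop(0)                      # the leading term is now zero
    return quotient, remainder

# (x^3 + 3x) / (x^2 - 2x - 3)
q, r = poly_divmod([1, 0, 3, 0], [1, -2, -3])
print(q, r)
```

For the example in the text the quotient list represents x + 2 and the remainder list represents 10x + 6.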

A general theorem in algebra states that every proper rational function can be expressed
as a finite sum of fractions of the forms

$$\frac{A}{(x + a)^k} \quad\text{and}\quad \frac{Bx + C}{(x^2 + bx + c)^m} ,$$

where k and m are positive integers and A, B, C, a, b, c are constants with $b^2 - 4c < 0$.
The condition $b^2 - 4c < 0$ means that the quadratic polynomial $x^2 + bx + c$ cannot be
factored into linear factors with real coefficients or, what amounts to the same thing, the
quadratic equation $x^2 + bx + c = 0$ has no real roots. Such a quadratic factor is said to
be irreducible. When a rational function has been so expressed, we say that it has been
decomposed into partial fractions. Therefore the problem of integrating this rational
function reduces to that of integrating its partial fractions. These may be easily dealt with
by the techniques described in the examples which follow.

We shall not bother to prove that partial-fraction decompositions always exist. Instead, 
we shall show (by means of examples) how to obtain the partial fractions in specific 
problems. In each case that arises the partial-fraction decomposition can be verified 
directly. 

It is convenient to separate the discussion into cases depending on the way in which the
denominator of the quotient f(x)/g(x) can be factored.

CASE 1. The denominator is a product of distinct linear factors. Suppose that g(x) splits
into n distinct linear factors, say

$$g(x) = (x - x_1)(x - x_2) \cdots (x - x_n) .$$

Now notice that a linear combination of the form

$$\frac{A_1}{x - x_1} + \cdots + \frac{A_n}{x - x_n}$$

may be expressed as a single fraction with the common denominator g(x), and the numerator
of this fraction will be a polynomial of degree < n involving the A's. Therefore, if we can
find A's to make this numerator equal to f(x), we shall have the decomposition

$$\frac{f(x)}{g(x)} = \frac{A_1}{x - x_1} + \cdots + \frac{A_n}{x - x_n} ,$$

and the integral of f(x)/g(x) will be equal to $\sum_{i=1}^{n} A_i \log |x - x_i|$. In the next example, we
work out a case with n = 3.

example 3. Integrate $\int \dfrac{2x^2 + 5x - 1}{x^3 + x^2 - 2x} \, dx$.

Solution. Since $x^3 + x^2 - 2x = x(x - 1)(x + 2)$, the denominator is a product of
distinct linear factors, and we try to find $A_1$, $A_2$, and $A_3$ such that

$$\frac{2x^2 + 5x - 1}{x^3 + x^2 - 2x} = \frac{A_1}{x} + \frac{A_2}{x - 1} + \frac{A_3}{x + 2} .$$

Clearing the fractions, we obtain

$$2x^2 + 5x - 1 = A_1 (x - 1)(x + 2) + A_2 x(x + 2) + A_3 x(x - 1) .$$

When x = 0, we find $-2A_1 = -1$, so $A_1 = \tfrac{1}{2}$. When x = 1, we obtain $3A_2 = 6$, $A_2 = 2$,
and when x = -2, we find $6A_3 = -3$, or $A_3 = -\tfrac{1}{2}$. Therefore we have

$$\int \frac{2x^2 + 5x - 1}{x^3 + x^2 - 2x} \, dx = \frac{1}{2} \int \frac{dx}{x} + 2 \int \frac{dx}{x - 1} - \frac{1}{2} \int \frac{dx}{x + 2}$$

$$= \frac{1}{2} \log |x| + 2 \log |x - 1| - \frac{1}{2} \log |x + 2| + C .$$
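
The substitution of the roots used in Example 3 works for any denominator that is a product of distinct factors $x - x_i$ with leading coefficient 1. The sketch below is an illustration under that assumption; the function `cover_up` is a hypothetical helper name, and exact fractions are used so the results can be compared with the values found above.

```python
from fractions import Fraction

def cover_up(numerator, roots):
    """Constants A_i with f(x)/prod(x - x_i) = sum of A_i/(x - x_i).

    `numerator` is a function evaluating f; `roots` lists the distinct x_i.
    Substituting x = x_i after clearing fractions leaves only the A_i term.
    """
    coefficients = []
    for i, r in enumerate(roots):
        product = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                product *= (r - s)     # value at x_i of the factors multiplying A_i
        coefficients.append(Fraction(numerator(r)) / product)
    return coefficients

# Example 3: f(x) = 2x^2 + 5x - 1, g(x) = x (x - 1)(x + 2)
coeffs = cover_up(lambda x: 2 * x * x + 5 * x - 1, [0, 1, -2])
print(coeffs)
```

The method does not apply as it stands to repeated or quadratic factors, which are treated in the cases that follow.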

CASE 2. The denominator is a product of linear factors, some of which are repeated. We
illustrate this case with an example.

example 4. Integrate $\int \dfrac{x^2 + 2x + 3}{(x - 1)(x + 1)^2} \, dx$.

Solution. Here we try to find $A_1$, $A_2$, $A_3$ so that

$$\frac{x^2 + 2x + 3}{(x - 1)(x + 1)^2} = \frac{A_1}{x - 1} + \frac{A_2}{x + 1} + \frac{A_3}{(x + 1)^2} . \tag{6.57}$$

We need both $A_2/(x + 1)$ and $A_3/(x + 1)^2$ as well as $A_1/(x - 1)$ in order to get a polynomial
of degree two in the numerator and to have as many constants as equations when we try
to determine the A's. Clearing the fractions, we obtain

$$x^2 + 2x + 3 = A_1 (x + 1)^2 + A_2 (x - 1)(x + 1) + A_3 (x - 1) . \tag{6.58}$$

Substituting x = 1, we find $4A_1 = 6$, so $A_1 = \tfrac{3}{2}$. When x = -1, we obtain $-2A_3 = 2$
and $A_3 = -1$. We need one more equation to determine $A_2$. Since there are no other
choices of x that will make any factor vanish, we choose a convenient x that will help to
simplify the calculations. For example, the choice x = 0 leads to the equation $3 = A_1 -
A_2 - A_3$ from which we find $A_2 = -\tfrac{1}{2}$. An alternative method is to differentiate both
sides of (6.58) and then substitute a convenient x. Differentiation of (6.58) leads to the
equation

$$2x + 2 = 2A_1 (x + 1) + A_2 (x - 1) + A_2 (x + 1) + A_3 ,$$

and, if we put x = -1, we find $0 = -2A_2 + A_3$, so $A_2 = \tfrac{1}{2} A_3 = -\tfrac{1}{2}$, as before. Therefore
we have found A's to satisfy (6.57), so we have

$$\int \frac{x^2 + 2x + 3}{(x - 1)(x + 1)^2} \, dx = \frac{3}{2} \int \frac{dx}{x - 1} - \frac{1}{2} \int \frac{dx}{x + 1} - \int \frac{dx}{(x + 1)^2}$$

$$= \frac{3}{2} \log |x - 1| - \frac{1}{2} \log |x + 1| + \frac{1}{x + 1} + C .$$


If, on the left of (6.57), the factor $(x + 1)^3$ had appeared instead of $(x + 1)^2$, we would
have added an extra term $A_4/(x + 1)^3$ on the right. More generally, if a linear factor
x + a appears p times in the denominator, then for this factor we must allow for a sum
of p terms, namely

$$\sum_{k=1}^{p} \frac{A_k}{(x + a)^k} , \tag{6.59}$$

where the A's are constants. A sum of this type is to be used for each repeated linear factor.


CASE 3. The denominator contains irreducible quadratic factors, none of which are 
repeated. 


example 5. Integrate $\int \dfrac{3x^2 + 2x - 2}{x^3 - 1} \, dx$.

Solution. The denominator can be split as the product $x^3 - 1 = (x - 1)(x^2 + x + 1)$,
where $x^2 + x + 1$ is irreducible, and we try a decomposition of the form

$$\frac{3x^2 + 2x - 2}{x^3 - 1} = \frac{A}{x - 1} + \frac{Bx + C}{x^2 + x + 1} .$$

In the fraction with denominator $x^2 + x + 1$, we have used a linear polynomial Bx + C
in the numerator in order to have as many constants as equations when we solve for A, B,
C. Clearing the fractions and solving for A, B, and C, we find A = 1, B = 2, and C = 3.
Therefore we have

$$\int \frac{3x^2 + 2x - 2}{x^3 - 1} \, dx = \int \frac{dx}{x - 1} + \int \frac{2x + 3}{x^2 + x + 1} \, dx .$$

The first integral on the right is log |x - 1|. To evaluate the second integral, we write

$$\int \frac{2x + 3}{x^2 + x + 1} \, dx = \int \frac{2x + 1}{x^2 + x + 1} \, dx + 2 \int \frac{dx}{x^2 + x + 1}
= \log (x^2 + x + 1) + 2 \int \frac{dx}{(x + \tfrac{1}{2})^2 + \tfrac{3}{4}} .$$

If we let $u = x + \tfrac{1}{2}$ and $\alpha = \tfrac{1}{2}\sqrt{3}$, the last integral is

$$2 \int \frac{du}{u^2 + \alpha^2} = \frac{2}{\alpha} \arctan \frac{u}{\alpha} = \frac{4}{3} \sqrt{3} \arctan \frac{2x + 1}{\sqrt{3}} .$$

Therefore, we have

$$\int \frac{3x^2 + 2x - 2}{x^3 - 1} \, dx = \log |x - 1| + \log (x^2 + x + 1) + \frac{4}{3} \sqrt{3} \arctan \frac{2x + 1}{\sqrt{3}} + C .$$


CASE 4. The denominator contains irreducible quadratic factors, some of which are
repeated. Here the situation is analogous to Case 2. In the partial-fraction decomposition
of f(x)/g(x) we allow, first of all, a sum of the form (6.59) for each linear factor, as already
described. In addition, if an irreducible quadratic factor $x^2 + bx + c$ is repeated m times,
we allow a sum of m terms, namely

$$\sum_{k=1}^{m} \frac{B_k x + C_k}{(x^2 + bx + c)^k} ,$$

where each numerator is linear.


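example 6. Integrate $\int \dfrac{x^4 - x^3 + 2x^2 - x + 2}{(x - 1)(x^2 + 2)^2} \, dx$.

Solution. We write

$$\frac{x^4 - x^3 + 2x^2 - x + 2}{(x - 1)(x^2 + 2)^2} = \frac{A}{x - 1} + \frac{Bx + C}{x^2 + 2} + \frac{Dx + E}{(x^2 + 2)^2} .$$

Clearing the fractions and solving for A, B, C, D, and E, we find that

$$A = \tfrac{1}{3} , \quad B = \tfrac{2}{3} , \quad C = -\tfrac{1}{3} , \quad D = -1 , \quad E = 0 .$$

Therefore, we have

$$\int \frac{x^4 - x^3 + 2x^2 - x + 2}{(x - 1)(x^2 + 2)^2} \, dx
= \frac{1}{3} \int \frac{dx}{x - 1} + \frac{1}{3} \int \frac{2x - 1}{x^2 + 2} \, dx - \int \frac{x \, dx}{(x^2 + 2)^2}$$

$$= \frac{1}{3} \int \frac{dx}{x - 1} + \frac{1}{3} \int \frac{2x \, dx}{x^2 + 2} - \frac{1}{3} \int \frac{dx}{x^2 + 2} - \frac{1}{2} \int \frac{2x \, dx}{(x^2 + 2)^2}$$

$$= \frac{1}{3} \log |x - 1| + \frac{1}{3} \log (x^2 + 2) - \frac{\sqrt{2}}{6} \arctan \frac{x}{\sqrt{2}} + \frac{1}{2} \, \frac{1}{x^2 + 2} + C .$$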
The foregoing examples are typical of what happens in general. The problem of integrating
a proper rational function reduces to that of calculating integrals of the forms

$$\int \frac{dx}{(x + a)^n} , \qquad \int \frac{x \, dx}{(x^2 + bx + c)^m} , \qquad\text{and}\qquad \int \frac{dx}{(x^2 + bx + c)^m} .$$

The first integral is log |x + a| if n = 1 and $(x + a)^{1-n}/(1 - n)$ if n > 1. To treat the other
two, we express the quadratic as a sum of two squares by writing

$$x^2 + bx + c = \left(x + \frac{b}{2}\right)^2 + \left(c - \frac{b^2}{4}\right) = u^2 + \alpha^2 ,$$

where u = x + b/2 and $\alpha = \tfrac{1}{2}\sqrt{4c - b^2}$. (This is possible because $4c - b^2 > 0$.) The
substitution u = x + b/2 reduces the problem to that of computing

$$\int \frac{u \, du}{(u^2 + \alpha^2)^m} \quad\text{and}\quad \int \frac{du}{(u^2 + \alpha^2)^m} . \tag{6.60}$$

The first of these is $\tfrac{1}{2} \log (u^2 + \alpha^2)$ if m = 1, and $\tfrac{1}{2}(u^2 + \alpha^2)^{1-m}/(1 - m)$ if m > 1. When
m = 1, the second integral in (6.60) is evaluated by the formula

$$\int \frac{du}{u^2 + \alpha^2} = \frac{1}{\alpha} \arctan \frac{u}{\alpha} + C .$$

The case m > 1 may be reduced to the case m = 1 by repeated application of the recursion
formula

$$\int \frac{du}{(u^2 + \alpha^2)^m} = \frac{1}{2\alpha^2 (m - 1)} \, \frac{u}{(u^2 + \alpha^2)^{m-1}} + \frac{2m - 3}{2\alpha^2 (m - 1)} \int \frac{du}{(u^2 + \alpha^2)^{m-1}} ,$$

which is obtained by integration by parts. This discussion shows that every rational
function may be integrated in terms of polynomials, rational functions, inverse tangents,
and logarithms.
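
The recursion formula lends itself to a short recursive routine. The following sketch is an illustration only: `I` is a hypothetical name for the primitive of $1/(u^2 + \alpha^2)^m$ that vanishes at u = 0, and `midpoint` is a simple numerical integrator used as an independent check.

```python
import math

def I(m, u, alpha):
    """Primitive of 1/(u^2 + alpha^2)^m vanishing at u = 0, by the recursion formula."""
    if m == 1:
        return math.atan(u / alpha) / alpha
    c = 2 * alpha ** 2 * (m - 1)
    return u / (c * (u ** 2 + alpha ** 2) ** (m - 1)) + (2 * m - 3) / c * I(m - 1, u, alpha)

def midpoint(func, a, b, n=20_000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

alpha, upper, m = 1.5, 2.0, 3
exact = I(m, upper, alpha)
approx = midpoint(lambda u: 1.0 / (u * u + alpha * alpha) ** m, 0.0, upper)
print(exact, approx)   # the two values should agree to many decimal places
```

Each call lowers m by one, so the recursion ends at the arctangent formula after m - 1 steps.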


6.24 Integrals which can be transformed into integrals of rational functions 

A function of two variables defined by an equation of the form

$$P(x, y) = \sum_{m=0}^{p} \sum_{n=0}^{q} a_{m,n} x^m y^n$$

is called a polynomial in two variables. The quotient of two such polynomials is called a
rational function of two variables. Integrals of the form $\int R(\sin x, \cos x) \, dx$, where R is a
rational function of two variables, may be reduced by the substitution $u = \tan \tfrac{1}{2}x$ to
integrals of the form $\int r(u) \, du$ where r is a rational function of one variable. The latter
integral may be evaluated by the techniques just described. We illustrate the method with
a particular example.

example 1. Integrate $\int \dfrac{dx}{\sin x + \cos x}$.
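Solution. The substitution $u = \tan \tfrac{1}{2}x$ gives us

$$x = 2 \arctan u , \qquad dx = \frac{2}{1 + u^2} \, du ,$$

$$\sin x = 2 \sin \frac{x}{2} \cos \frac{x}{2} = \frac{2 \tan \tfrac{1}{2}x}{\sec^2 \tfrac{1}{2}x} = \frac{2u}{1 + u^2} ,$$

$$\cos x = 2 \cos^2 \frac{x}{2} - 1 = \frac{2}{\sec^2 \tfrac{1}{2}x} - 1 = \frac{2}{1 + u^2} - 1 = \frac{1 - u^2}{1 + u^2} ,$$

and

$$\sin x + \cos x = \frac{2u + 1 - u^2}{1 + u^2} .$$

Therefore, we have

$$\int \frac{dx}{\sin x + \cos x} = -2 \int \frac{du}{u^2 - 2u - 1} = -2 \int \frac{du}{(u - a)(u - b)} ,$$

where $a = 1 + \sqrt{2}$ and $b = 1 - \sqrt{2}$. The method of partial fractions leads to

$$\int \frac{du}{(u - a)(u - b)} = \frac{1}{a - b} \int \left( \frac{1}{u - a} - \frac{1}{u - b} \right) du$$

and, since $a - b = 2\sqrt{2}$, we obtain

$$\int \frac{dx}{\sin x + \cos x} = \frac{\sqrt{2}}{2} \log \left| \frac{u - b}{u - a} \right| + C
= \frac{\sqrt{2}}{2} \log \left| \frac{\tan \tfrac{1}{2}x - 1 + \sqrt{2}}{\tan \tfrac{1}{2}x - 1 - \sqrt{2}} \right| + C . \tag{6.61}$$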

The final answer may be simplified somewhat by using suitable trigonometric identities.
First we note that $\sqrt{2} - 1 = \tan \tfrac{1}{8}\pi$ so the numerator of the last fraction in (6.61) is
$\tan \tfrac{1}{2}x + \tan \tfrac{1}{8}\pi$. In the denominator we write

$$\tan \frac{x}{2} - 1 - \sqrt{2} = (\sqrt{2} + 1) \left[ (\sqrt{2} - 1) \tan \frac{x}{2} - 1 \right]
= -(\sqrt{2} + 1) \left( 1 - \tan \frac{x}{2} \tan \frac{\pi}{8} \right) .$$

Taking logarithms as indicated in (6.61), we may combine the term $-\tfrac{1}{2}\sqrt{2} \log (\sqrt{2} + 1)$
with the arbitrary constant and rewrite (6.61) as follows:

$$\int \frac{dx}{\sin x + \cos x} = \frac{\sqrt{2}}{2} \log \left| \tan \left( \frac{x}{2} + \frac{\pi}{8} \right) \right| + C .$$

In an earlier section we derived the integration formula

$$\int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x$$

as a consequence of the formula for differentiating arcsin x. The presence of arcsin x
suggests that we could also evaluate this integral by the trigonometric substitution
t = arcsin x. We then have

$$x = \sin t , \qquad dx = \cos t \, dt , \qquad \sqrt{1 - x^2} = \sqrt{1 - \sin^2 t} = \cos t ,$$

and we find that

$$\int \frac{dx}{\sqrt{1 - x^2}} = \int \frac{\cos t \, dt}{\cos t} = \int dt = t = \arcsin x .$$


This is always a good substitution to try if the integrand involves $\sqrt{1 - x^2}$. More
generally, any integral of the form $\int R(x, \sqrt{a^2 - x^2}) \, dx$, where R is a rational function of
two variables, can be transformed by the substitution

$$x = a \sin t , \qquad dx = a \cos t \, dt ,$$

into an integral of the form $\int R(a \sin t, a \cos t) \, a \cos t \, dt$. This, in turn, can always be
integrated by one of the methods described above.


example 2. Integrate $\int \dfrac{x \, dx}{4 - x^2 + \sqrt{4 - x^2}}$.

Solution. We let x = 2 sin t, dx = 2 cos t dt, $\sqrt{4 - x^2} = 2 \cos t$, and we find that

$$\int \frac{x \, dx}{4 - x^2 + \sqrt{4 - x^2}} = \int \frac{4 \sin t \cos t \, dt}{4 \cos^2 t + 2 \cos t} = \int \frac{\sin t \, dt}{\cos t + \tfrac{1}{2}}$$

$$= -\log |\tfrac{1}{2} + \cos t| + C = -\log (1 + \sqrt{4 - x^2}) + C .$$

The same method works for integrals of the form

$$\int R(x, \sqrt{a^2 - (cx + d)^2}) \, dx ;$$

we use the trigonometric substitution cx + d = a sin t.

We can deal similarly with integrals of the form

$$\int R(x, \sqrt{a^2 + (cx + d)^2}) \, dx$$

by the substitution cx + d = a tan t, $c \, dx = a \sec^2 t \, dt$. For integrals of the form

$$\int R(x, \sqrt{(cx + d)^2 - a^2}) \, dx ,$$

we use the substitution cx + d = a sec t, c dx = a sec t tan t dt. In either case, the new
integrand becomes a rational function of sin t and cos t.
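
The results of Examples 1 and 2 of this section can be checked by differentiating the answers numerically. The sketch below is an illustration, not part of the original text; the names `F1`, `F2`, `f1`, `f2`, and `derivative` are chosen here, and the derivative is approximated by a symmetric difference quotient.

```python
import math

def F1(x):
    # primitive found in Example 1: (sqrt(2)/2) log |tan(x/2 + pi/8)|
    return math.sqrt(2) / 2 * math.log(abs(math.tan(x / 2 + math.pi / 8)))

def f1(x):
    # integrand of Example 1
    return 1.0 / (math.sin(x) + math.cos(x))

def F2(x):
    # primitive found in Example 2: -log(1 + sqrt(4 - x^2))
    return -math.log(1.0 + math.sqrt(4.0 - x * x))

def f2(x):
    # integrand of Example 2
    return x / (4.0 - x * x + math.sqrt(4.0 - x * x))

def derivative(F, x, h=1e-6):
    # symmetric difference quotient
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.3, 1.0, 2.0):
    print(derivative(F1, x), f1(x))
for x in (-1.5, 0.4, 1.2):
    print(derivative(F2, x), f2(x))
```

The sample points are kept away from the zeros of sin x + cos x and from the endpoints x = -2 and x = 2, where the integrands are unbounded.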
6.25 Exercises


Evaluate the following integrals: 
2x + 3 


1. 

2 

3. 

4. 


(x - 2)(x + 5) 

x dx 


dx. 




l)(x + 2)(x + 3) 

x dx 


X d — 3x + 2 
x i + 2x - 6 
X 3 + X 2 - 2x 


dx. 




, 8x 3 + 7 

5. It — 7TT) - dx , 


6 . 




(x + l)(2x + l) 3 
4x2 +x + 1 

x 3 — 1 dX ■ 
x 4 dx 

X 4 + 5x2 + 4 . 
x + 2 


dx. 


X 4 +x 

dx 

x(x2 + l) 2 ' 


■Ji 

Ji 


dx 


(x + l)(x + 2) 2 (x + 3) 3 ' 

x dx 


12 . 

13. 

14. 

15. 

16. 

17. 

18. 
19. 


(x + l) 2 ' 
dx 

x 2 dx 

x 2 + x - 6 ' 

1 (x + 2) dx 
x 2 — 4x + 4 . 


JW 
J 


dx 

4x + 4)(x 2 - 4x + 5) 


(x - 3) dx 
x 3 + 3x2 + 2x ' 

dx 


(x 2 - l) 2 ' 


J 

1 ^ 7 - 

r x 4 + i 
J x(x 2 + l ) 2 


dx. 


2 0 
21 . 
22 . 

23. 

24. 


dx 


X 4 - 

2x 3 ' 

' 1 ~ 

•x 3 

x(x 2 

+ 1)' 

dx 


X 4 - 

1 . 

dx 



dx. 


X 4 + 1 




dx 


i 2 ' 


2x + 2) ! 


. 4x 5 — 1 

25. | — —dx. 


26 

27 

28 
29. 


J 

h 

Jr 

J 


(x 5 + x + l) 2 

dx 

sin x — cos x + 5 

dx 


+ a COS X 

dx 


1 + a COS x 


(0 < ; 
(a > 1 


30. 

31. 

32. 


sin 4 x 
1 + sin 2 x 

dx 


dx. 


J a 2 si 


sin 2 x + b 2 cos 2 


dx 


(a sin x + b COS x) 2 

irl 2 


dx 


1 + COS x + sin x ' 


33 
3 4 


. A/3 

v 


• x 2 dx. 


A/3 


: dx. 


J- 


35 I 




3 7 

38. 


J^ 


x- +5 dx . 


V X 2 + X + 1 


dx. 


< 1 ). 

(ab * 0). 

(a*0). 





39. 


J 


dx 

V x 2 + X ' 


40. 



dx. 


[Hint: In Exercise 40, multiply numerator and denominator by $\sqrt{2 - x - x^2}$.]


6.26 Miscellaneous review exercises 

1. Let $f(x) = \int_1^x (\log t)/(t + 1) \, dt$ if x > 0. Compute f(x) + f(1/x). As a check, you should
obtain $f(2) + f(\tfrac{1}{2}) = \tfrac{1}{2} \log^2 2$.

2. Find a function f continuous for all x (and not everywhere zero), such that 


f”(x) = 



sin t 

; dt . 

+ cos t 


3. Try to evaluate $\int e^x/x \, dx$ by using integration by parts.

4. Integrate $\int_0^{\pi/2} \log (e^{\cos x}) \, dx$.

5. A function f is defined by the equation

$$f(x) = \sqrt{\frac{4x + 2}{x(x + 1)(x + 2)}} \quad\text{if}\quad x > 0 .$$

(a) Find the slope of the graph of f at the point for which x = 1.

(b) The region under the graph and above the interval [1, 4] is rotated about the x-axis, thus
generating a solid of revolution. Write an integral for the volume of this solid. Compute this
integral and show that its value is $\pi \log (25/8)$.

6. A function F is defined by the following indefinite integral:

$$F(x) = \int_1^x \frac{e^t}{t} \, dt \quad\text{if}\quad x > 0 .$$

(a) For what values of x is it true that $\log x \le F(x)$?

(b) Prove that $\int_1^x \dfrac{e^t}{t + a} \, dt = e^{-a} [F(x + a) - F(1 + a)]$.

(c) In a similar way, express the following integrals in terms of F:

$$\int_1^x \frac{e^{at}}{t} \, dt , \qquad \int_1^x \frac{e^t}{t^2} \, dt , \qquad \int_1^x e^{1/t} \, dt .$$


7. In each case, give an example of a continuous function f satisfying the conditions stated for all
real x, or else explain why there is no such function:

(a) $\int_0^x f(t) \, dt = e^x$.

(b) $\int_0^{x^2} f(t) \, dt = 1 - 2^{x^2}$. [$2^{x^2}$ means $2^{(x^2)}$.]

(c) $\int_0^x f(t) \, dt = f^2(x) - 1$.

8. If f(x + y) = f(x)f(y) for all x and y and if f(x) = 1 + xg(x), where $g(x) \to 1$ as $x \to 0$,
prove that (a) f'(x) exists for every x, and (b) $f(x) = e^x$.

9. Given a function g which has a derivative g'(x) for every real x and which satisfies the following
equations:

$$g'(0) = 2 \quad\text{and}\quad g(x + y) = e^y g(x) + e^x g(y) \quad\text{for all } x \text{ and } y .$$

(a) Show that $g(2x) = 2 e^x g(x)$ and find a similar formula for g(3x).

(b) Generalize (a) by finding a formula relating g(nx) to g(x), valid for every positive integer
n. Prove your result by induction.

(c) Show that g(0) = 0 and find the limit of g(h)/h as $h \to 0$.

(d) There is a constant C such that $g'(x) = g(x) + Ce^x$ for all x. Prove this statement and
find the value of C. [Hint: Use the definition of the derivative g'(x).]

10. A periodic function with period a satisfies f(x + a) = f(x) for all x in its domain. What can
you conclude about a function which has a derivative everywhere and satisfies an equation of
the form

$$f(x + a) = b f(x)$$

for all x, where a and b are positive constants?

11. Use logarithmic differentiation to derive the formulas for differentiation of products and 
quotients from the corresponding formulas for sums and differences. 

12. Let A = efit + 1) dt. Express the values of the following integrals in terms of A: 


f*a e t 

' f e * 

(a) . n , dt. 

Jo-i t - a - \ 

(c) J, (» + n d ‘ ■ 

f 1 fe (2 

f 1 , 

(b) + 

(d) J e* log (1 + t) dt. 


13. Let $p(x) = c_0 + c_1 x + c_2 x^2$ and let $f(x) = e^x p(x)$.

(a) Show that $f^{(n)}(0)$, the nth derivative of f at 0, is $c_0 + n c_1 + n(n - 1) c_2$.

(b) Solve the problem when p is a polynomial of degree 3.

(c) Generalize to a polynomial of degree m.

14. Let f(x) = x sin ax. Show that $f^{(2n)}(x) = (-1)^n (a^{2n} x \sin ax - 2n a^{2n-1} \cos ax)$.

15. Prove that

$$\sum_{k=0}^{n} (-1)^k \binom{n}{k} \frac{1}{k + m + 1} = \sum_{k=0}^{m} (-1)^k \binom{m}{k} \frac{1}{k + n + 1}.$$

[Hint: $1/(k + m + 1) = \int_0^1 t^{k+m}\,dt$.]


16. Let $F(x) = \int_0^x f(t)\,dt$. Determine a formula (or formulas) for computing F(x) for all real x
if f is defined as follows:

(a) $f(t) = (t + |t|)^2$.

(b) $f(t) = 1 - t^2$ if $|t| \le 1$, and $f(t) = 1 - |t|$ if $|t| > 1$.

(c) $f(t) = e^{-|t|}$.

(d) f(t) = the maximum of 1 and $t^2$.


17. A solid of revolution is generated by rotating the graph of a continuous function f around
the interval [0, a] on the x-axis. If, for every a > 0, the volume is $a^2 + a$, find the function f.

18. Let $f(x) = e^{2x}$ for all x. Denote by S(t) the ordinate set of f over the interval [0, t], where
t > 0. Let A(t) be the area of S(t), V(t) the volume of the solid obtained by rotating S(t)
about the x-axis, and W(t) the volume of the solid obtained by rotating S(t) about the y-axis.
Compute the following: (a) A(t); (b) V(t); (c) W(t); (d) $\lim_{t \to 0} V(t)/A(t)$.

19. Let c be the number such that $\sinh c = \frac{3}{4}$. (Do not attempt to compute c.) In each case
find all those x (if any exist) satisfying the given equation. Express your answers in terms of
log 2 and log 3.

(a) $\log(e^x + \sqrt{e^{2x} + 1}) = c$. (b) $\log(e^x - \sqrt{e^{2x} - 1}) = c$.

20. Determine whether each of the following statements is true or false. Prove each true statement.

(a) $2^{\log 5} = 5^{\log 2}$.

(b) $\log_2 5 = \dfrac{\log_3 5}{\log_2 3}$.

(c) $\sum_{k=1}^{n} k^{-1/2} < 2\sqrt{n}$ for every $n \ge 1$.

(d) $1 + \sinh x < \cosh x$ for every x.

In Exercises 21 through 24, establish each inequality by examining the sign of the derivative of
an appropriate function.

21. $\frac{2}{\pi}\,x < \sin x < x$ if $0 < x < \frac{\pi}{2}$.


22. $\frac{1}{x + \frac{1}{2}} < \log\left(1 + \frac{1}{x}\right) < \frac{1}{x}$ if x > 0.

23. $x - \frac{x^3}{6} < \sin x < x$ if x > 0.

24. $(x^b + y^b)^{1/b} < (x^a + y^a)^{1/a}$ if x > 0, y > 0, and 0 < a < b.

25. Show that

(a) $\int_0^x e^{-t}\,t\,dt = e^{-x}(e^x - 1 - x)$.

(b) $\int_0^x e^{-t}\,t^2\,dt = 2!\,e^{-x}\left(e^x - 1 - x - \frac{x^2}{2!}\right)$.

(c) $\int_0^x e^{-t}\,t^3\,dt = 3!\,e^{-x}\left(e^x - 1 - x - \frac{x^2}{2!} - \frac{x^3}{3!}\right)$.



(d) Guess the generalization suggested and prove it by induction. 

26. If a, b, $a_1$, $b_1$ are given, with $ab \ne 0$, show that there exist constants A, B, C such that

$$\int \frac{a_1 \sin x + b_1 \cos x}{a \sin x + b \cos x}\,dx = Ax + B \log |a \sin x + b \cos x| + C.$$

[Hint: Show that A and B exist such that

$$a_1 \sin x + b_1 \cos x = A(a \sin x + b \cos x) + B(a \cos x - b \sin x).]$$


27. In each case, find a function f satisfying the given conditions.

(a) $f'(x^2) = 1/x$ for x > 0, f(1) = 1.

(b) $f'(\sin^2 x) = \cos^2 x$ for all x, f(1) = 1.

(c) $f'(\sin x) = \cos^2 x$ for all x, f(1) = 1.


(d) $f'(\log x) = 1$ for $0 < x \le 1$, and $f'(\log x) = x$ for x > 1; f(0) = 0.


28. A function, called the integral logarithm and denoted by Li, is defined as follows:

$$\mathrm{Li}(x) = \int_2^x \frac{dt}{\log t} \quad\text{if } x \ge 2.$$

This function occurs in analytic number theory where it is proved that Li(x) is a very good
approximation to the number of primes $\le x$. Derive the following properties of Li(x):

(a) $\mathrm{Li}(x) = \dfrac{x}{\log x} + \displaystyle\int_2^x \frac{dt}{\log^2 t} - \frac{2}{\log 2}$.

(b) $\mathrm{Li}(x) = \dfrac{x}{\log x} + \displaystyle\sum_{k=1}^{n-1} \frac{k!\,x}{\log^{k+1} x} + n! \int_2^x \frac{dt}{\log^{n+1} t} + C_n$,

where $C_n$ is a constant (depending on n). Find this constant.

(c) Show that there is a constant b such that $\int_b^{\log x} e^t/t\,dt = \mathrm{Li}(x)$ and find the value of b.

(d) Express $\int_c^x e^{2t}/(t - 1)\,dt$ in terms of the integral logarithm, where $c = 1 + \frac{1}{2} \log 2$.

(e) Let $f(x) = e^4\,\mathrm{Li}(e^{2x-4}) - e^2\,\mathrm{Li}(e^{2x-2})$ if x > 3. Show that

$$f'(x) = \frac{e^{2x}}{x^2 - 3x + 2}.$$

29. Let f(x) = log |x| if x < 0. Show that f has an inverse, and denote this inverse by g. What
is the domain of g? Find a formula for computing g(y) for each y in the domain of g. Sketch
the graph of g.

30. Let $f(x) = \int_0^x (1 + t^3)^{-1/2}\,dt$ if $x \ge 0$. (Do not attempt to evaluate this integral.)

(a) Show that f is strictly increasing on the nonnegative real axis.

(b) Let g denote the inverse of f. Show that the second derivative of g is proportional to $g^2$
[that is, $g''(y) = c\,g^2(y)$ for each y in the domain of g] and find the constant of proportionality.


POLYNOMIAL APPROXIMATIONS TO FUNCTIONS


7.1 Introduction 

Polynomials are among the simplest functions that occur in analysis. They are pleasant 
to work with in numerical computations because their values may be found by performing 
a finite number of multiplications and additions. In Chapter 6 we showed that the logarithm 
function can be approximated by polynomials that enable us to compute logarithms to any 
desired degree of accuracy. In this chapter we will show that many other functions, such 
as the exponential and trigonometric functions, can also be approximated by polynomials. 
If the difference between a function and its polynomial approximation is sufficiently small, 
then we can, for practical purposes, compute with the polynomial in place of the original 
function. 

There are many ways to approximate a given function f by polynomials, depending on
what use is to be made of the approximation. In this chapter we shall be interested in
obtaining a polynomial which agrees with f and some of its derivatives at a given point.
We begin our discussion with a simple example.

Suppose f is the exponential function, $f(x) = e^x$. At the point x = 0, the function f and
all its derivatives have the value 1. The linear polynomial

$$g(x) = 1 + x$$

also has g(0) = 1 and g'(0) = 1, so it agrees with f and its first derivative at 0. Geometrically,
this means the graph of g is the tangent line of f at the point (0, 1), as shown in Figure 7.1.

If we approximate f by a quadratic polynomial Q which agrees with f and its first two
derivatives at 0, we might expect a better approximation to f than the linear function g, at
least near the point (0, 1). The polynomial

$$Q(x) = 1 + x + \tfrac{1}{2} x^2$$

has Q(0) = Q'(0) = 1 and Q''(0) = f''(0) = 1. Figure 7.1 shows that the graph of Q
approximates the curve $y = e^x$ more closely than the line y = 1 + x near the point (0, 1).

Figure 7.1 Polynomial approximations to the curve $y = e^x$ near (0, 1).

We can improve further the accuracy of the approximation by using polynomials which
agree with f in the third and higher derivatives as well. It is easy to verify that the polynomial

$$P(x) = \sum_{k=0}^{n} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}$$


agrees with the exponential function and its first n derivatives at the point x = 0. Of 
course, before we can use such polynomials to compute approximate values for the 
exponential function, we need some information about the error made in the approximation. 
Rather than discuss this particular example in more detail, we turn now to the general 
theory. 
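
The improvement from g to Q to higher-degree polynomials can be observed numerically. The following is a minimal sketch, not part of the original text; the function name `exp_taylor` and the sample point x = 0.5 are choices made here for illustration only.

```python
import math

def exp_taylor(x, n):
    """Evaluate 1 + x + x^2/2! + ... + x^n/n! by accumulating terms."""
    term = 1.0
    total = 1.0
    for k in range(1, n + 1):
        term *= x / k  # turns x^(k-1)/(k-1)! into x^k/k!
        total += term
    return total

# Compare the polynomial values with e^x at a point near 0.
for n in (1, 2, 3, 6):
    approx = exp_taylor(0.5, n)
    print(n, approx, abs(math.exp(0.5) - approx))
```

The printed differences shrink as n grows, which is what the graphs suggest; how fast they shrink is the question the error estimates later in the chapter address.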


7.2 The Taylor polynomials generated by a function 

Suppose f has derivatives up to order n at the point x = 0, where $n \ge 1$, and let us
try to find a polynomial P which agrees with f and its first n derivatives at 0. There are n + 1
conditions to be satisfied, namely

(7.2) $P(0) = f(0), \quad P'(0) = f'(0), \quad \ldots, \quad P^{(n)}(0) = f^{(n)}(0),$

so we try a polynomial of degree n, say

(7.3) $P(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n,$

with n + 1 coefficients to be determined. We shall use the conditions in (7.2) to determine
these coefficients in succession.

First, we put x = 0 in (7.3) and we find $P(0) = c_0$, so $c_0 = f(0)$. Next, we differentiate
both sides of (7.3) and then substitute x = 0 once more to find $P'(0) = c_1$; hence $c_1 = f'(0)$.
If we differentiate (7.3) again and put x = 0, we find that $P''(0) = 2c_2$, so $c_2 = f''(0)/2$.
After differentiating k times, we find that $P^{(k)}(0) = k!\,c_k$, and this gives us the formula

(7.4) $c_k = \dfrac{f^{(k)}(0)}{k!}$

for k = 0, 1, 2, ..., n. [When k = 0, we interpret $f^{(0)}(0)$ to mean f(0).] This argument
proves that if a polynomial of degree $\le n$ exists which satisfies (7.2), then its coefficients
are necessarily given by (7.4). (The degree of P will be equal to n if and only if $f^{(n)}(0) \ne 0$.)
Conversely, it is easy to verify that the polynomial P with coefficients given by (7.4) satisfies
(7.2), and therefore we have the following theorem.


THEOREM 7.1. Let f be a function with derivatives of order n at the point x = 0. Then
there exists one and only one polynomial P of degree $\le n$ which satisfies the n + 1 conditions

$$P(0) = f(0), \quad P'(0) = f'(0), \quad \ldots, \quad P^{(n)}(0) = f^{(n)}(0).$$

This polynomial is given by the formula

$$P(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!}\,x^k.$$

In the same way, we may show that there is one and only one polynomial of degree $\le n$
which agrees with f and its first n derivatives at a point x = a. In fact, instead of (7.3), we
may write P in powers of x - a and proceed as before. If we evaluate the derivatives at a
in place of 0, we are led to the polynomial

(7.5) $P(x) = \displaystyle\sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k.$

This is the one and only polynomial of degree $\le n$ which satisfies the conditions

$$P(a) = f(a), \quad P'(a) = f'(a), \quad \ldots, \quad P^{(n)}(a) = f^{(n)}(a),$$

and it is referred to as a Taylor polynomial in honor of the English mathematician Brook
Taylor (1685-1731). More precisely, we say that the polynomial in (7.5) is the Taylor
polynomial of degree n generated by f at the point a.

It is convenient to have a notation that indicates the dependence of the Taylor polynomial
P on f and n. We shall indicate this dependence by writing $P = T_n f$ or $P = T_n(f)$. The
symbol $T_n$ is called the Taylor operator of degree n. When this operator is applied to a
function f, it produces a new function $T_n f$, the Taylor polynomial of degree n. The value
of this function at x is denoted by $T_n f(x)$ or by $T_n[f(x)]$. If we also wish to indicate the
dependence on a, we write $T_n f(x; a)$ instead of $T_n f(x)$.
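
Formulas (7.4) and (7.5) translate directly into a short computation. The sketch below is an illustration added here, not the book's own procedure: it assumes the derivative values f(a), f'(a), ..., f^(n)(a) are already known and supplied as a list, and the helper names `taylor_coefficients` and `taylor_eval` are hypothetical.

```python
from math import factorial

def taylor_coefficients(derivs_at_a):
    """Given [f(a), f'(a), ..., f^(n)(a)], return c_k = f^(k)(a)/k! as in (7.4)."""
    return [d / factorial(k) for k, d in enumerate(derivs_at_a)]

def taylor_eval(derivs_at_a, a, x):
    """Evaluate T_n f(x; a) from (7.5) by Horner's rule in powers of (x - a)."""
    result = 0.0
    for c in reversed(taylor_coefficients(derivs_at_a)):
        result = result * (x - a) + c
    return result

# Exponential function at a = 0: every derivative equals 1.
print(taylor_eval([1.0] * 5, 0.0, 0.1))

# f(x) = x^2 at a = 1 has f(1) = 1, f'(1) = 2, f''(1) = 2, and T_2 f reproduces f.
print(taylor_eval([1.0, 2.0, 2.0], 1.0, 3.0))
```

Horner's rule is used only to keep the number of multiplications small; summing the terms of (7.5) one by one would give the same polynomial.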

EXAMPLE 1. When f is the exponential function, $f(x) = E(x) = e^x$, we have $E^{(k)}(x) = e^x$
for all k, so $E^{(k)}(0) = e^0 = 1$, and the Taylor polynomial of degree n generated by E at 0
is given by the formula

$$T_n E(x) = T_n(e^x) = \sum_{k=0}^{n} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}.$$

If we want a polynomial which agrees with E and its derivatives at the point a = 1, we
have $E^{(k)}(1) = e$ for all k, so (7.5) gives us

$$T_n E(x; 1) = \sum_{k=0}^{n} \frac{e}{k!}\,(x - 1)^k.$$

EXAMPLE 2. When f(x) = sin x, we have f'(x) = cos x, f''(x) = -sin x, f'''(x) = -cos x,
$f^{(4)}(x) = \sin x$, etc., so $f^{(2n+1)}(0) = (-1)^n$ and $f^{(2n)}(0) = 0$. Thus only odd powers of x
appear in the Taylor polynomials generated by the sine function at 0. The Taylor polynomial
of degree 2n + 1 has the form

$$T_{2n+1}(\sin x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^n \frac{x^{2n+1}}{(2n + 1)!}.$$

EXAMPLE 3. Arguing as in Example 2, we find that the Taylor polynomials generated
by the cosine function at 0 contain only even powers of x. The polynomial of degree 2n
is given by

$$T_{2n}(\cos x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^n \frac{x^{2n}}{(2n)!}.$$

Note that each Taylor polynomial $T_{2n}(\cos x)$ is the derivative of the Taylor polynomial
$T_{2n+1}(\sin x)$. This is due to the fact that the cosine itself is the derivative of the sine. In
the next section we learn that certain relations which hold between functions are transmitted
to their Taylor polynomials.
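
The remark that the cosine polynomial is the derivative of the sine polynomial can be checked exactly on coefficient lists. This is a small sketch under the assumption that a polynomial is stored as the list of its coefficients in increasing powers of x; the helper names are introduced here and do not come from the text.

```python
from fractions import Fraction
from math import factorial

def sin_taylor(n):
    """Coefficients [c_0, ..., c_(2n+1)] of the degree 2n+1 Taylor polynomial of sin at 0."""
    coeffs = [Fraction(0)] * (2 * n + 2)
    for k in range(n + 1):
        coeffs[2 * k + 1] = Fraction((-1) ** k, factorial(2 * k + 1))
    return coeffs

def cos_taylor(n):
    """Coefficients [c_0, ..., c_(2n)] of the degree 2n Taylor polynomial of cos at 0."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = Fraction((-1) ** k, factorial(2 * k))
    return coeffs

def differentiate(coeffs):
    """Differentiate a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

# Exact rational arithmetic, so equality of the two lists is a genuine identity check.
print(differentiate(sin_taylor(3)) == cos_taylor(3))
```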


7.3 Calculus of Taylor polynomials 


If a function f has derivatives of order n at a point a, we can always form its Taylor
polynomial $T_n f$ by the formula

$$T_n f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k.$$

Sometimes the calculation of the derivatives $f^{(k)}(a)$ may become lengthy, so it is desirable
to have alternate methods for determining Taylor polynomials. The next theorem describes
properties of the Taylor operator that often enable us to obtain new Taylor polynomials
from given ones. In this theorem it is understood that all Taylor polynomials are generated
at a common point a.

THEOREM 7.2. The Taylor operator $T_n$ has the following properties:

(a) Linearity property. If $c_1$ and $c_2$ are constants, then

$$T_n(c_1 f + c_2 g) = c_1 T_n(f) + c_2 T_n(g).$$

(b) Differentiation property. The derivative of a Taylor polynomial of f is a Taylor
polynomial of f'; in fact, we have

$$(T_n f)' = T_{n-1}(f').$$

(c) Integration property. An indefinite integral of a Taylor polynomial of f is a Taylor
polynomial of an indefinite integral of f. More precisely, if $g(x) = \int_a^x f(t)\,dt$, then we
have

$$T_{n+1} g(x) = \int_a^x T_n f(t)\,dt.$$

Proof. Each statement (a), (b), or (c), is an equation involving two polynomials of the
same degree. To prove each statement we simply observe that the polynomial which
appears on the left has the same value and the same derivatives at the point a as the one
which appears on the right. Then we invoke the uniqueness property of Theorem 7.1.
Note that differentiation of a polynomial lowers its degree, whereas integration increases
its degree.

The next theorem tells us what happens when we replace x by cx in a Taylor polynomial.


THEOREM 7.3. SUBSTITUTION PROPERTY. Let g(x) = f(cx), where c is a constant. Then
we have

$$T_n g(x; a) = T_n f(cx; ca).$$

In particular, when a = 0, we have $T_n g(x) = T_n f(cx)$.

Proof. Since g(x) = f(cx), the chain rule gives us

$$g'(x) = c f'(cx), \quad g''(x) = c^2 f''(cx), \quad \ldots, \quad g^{(k)}(x) = c^k f^{(k)}(cx).$$

Hence we obtain

$$T_n g(x; a) = \sum_{k=0}^{n} \frac{g^{(k)}(a)}{k!}\,(x - a)^k = \sum_{k=0}^{n} \frac{f^{(k)}(ca)}{k!}\,(cx - ca)^k = T_n f(cx; ca).$$


EXAMPLES. Replacing x by -x in the Taylor polynomial for $e^x$, we find that

$$T_n(e^{-x}) = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots + (-1)^n \frac{x^n}{n!}.$$

Since $\cosh x = \frac{1}{2} e^x + \frac{1}{2} e^{-x}$, we may use the linearity property to obtain

$$T_{2n}(\cosh x) = \tfrac{1}{2} T_{2n}(e^x) + \tfrac{1}{2} T_{2n}(e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!}.$$

The differentiation property gives us

$$T_{2n-1}(\sinh x) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots + \frac{x^{2n-1}}{(2n - 1)!}.$$


The next theorem is also useful in simplifying calculations of Taylor polynomials.


THEOREM 7.4. Let $P_n$ be a polynomial of degree $n \ge 1$. Let f and g be two functions
with derivatives of order n at 0 and assume that

(7.6) $f(x) = P_n(x) + x^n g(x),$

where $g(x) \to 0$ as $x \to 0$. Then $P_n$ is the Taylor polynomial generated by f at 0.

Proof. Let $h(x) = f(x) - P_n(x) = x^n g(x)$. By differentiating the product $x^n g(x)$
repeatedly, we see that h and its first n derivatives are 0 at x = 0. Therefore, f agrees with
$P_n$ and its first n derivatives at 0, so $P_n = T_n f$, as asserted.

EXAMPLES. From the algebraic identity

(7.7) $\dfrac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + \dfrac{x^{n+1}}{1 - x},$

valid for all $x \ne 1$, we see that (7.6) is satisfied with f(x) = 1/(1 - x), $P_n(x) = 1 + x + \cdots + x^n$,
and g(x) = x/(1 - x). Since $g(x) \to 0$ as $x \to 0$, Theorem 7.4 tells us that

$$T_n\left(\frac{1}{1 - x}\right) = 1 + x + x^2 + \cdots + x^n.$$

Integration of this relation gives us the further Taylor polynomial

$$T_{n+1}[-\log(1 - x)] = x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots + \frac{x^{n+1}}{n + 1}.$$

In (7.7) we may replace x by $-x^2$ to get

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - \cdots + (-1)^n x^{2n} - (-1)^n \frac{x^{2n+2}}{1 + x^2}.$$

Applying Theorem 7.4 once more, we find that

$$T_{2n}\left(\frac{1}{1 + x^2}\right) = \sum_{k=0}^{n} (-1)^k x^{2k}.$$

Integration of this relation leads to the formula

$$T_{2n+1}(\arctan x) = \sum_{k=0}^{n} (-1)^k \frac{x^{2k+1}}{2k + 1}.$$
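
The passage from the polynomial for 1/(1 + x^2) to the one for arctan x is a term-by-term integration, which is easy to mimic on coefficient lists. The following sketch is an added illustration; the representation of a polynomial as a list of coefficients and the helper names `integrate` and `evaluate` are assumptions made here.

```python
from fractions import Fraction
import math

def integrate(coeffs):
    """Integrate a polynomial from 0 to x, given its coefficient list."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def evaluate(coeffs, x):
    """Evaluate a polynomial at x by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

n = 4
# Degree 2n Taylor polynomial of 1/(1 + x^2): 1 - x^2 + x^4 - ... + (-1)^n x^(2n).
geometric = [(-1) ** (k // 2) if k % 2 == 0 else 0 for k in range(2 * n + 1)]
# Integration property: this is the degree 2n+1 Taylor polynomial of arctan x.
arctan_poly = integrate(geometric)

print(arctan_poly)
print(float(evaluate(arctan_poly, Fraction(1, 5))), math.atan(0.2))
```

The nonzero coefficients come out as 1, -1/3, 1/5, -1/7, 1/9, in agreement with the formula for the Taylor polynomial of arctan x.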
7.4 Exercises


1. Draw graphs of the Taylor polynomials $T_3(\sin x) = x - x^3/3!$ and $T_5(\sin x) = x - x^3/3! + x^5/5!$.
Pay careful attention to the points where the curves cross the x-axis. Compare these
graphs with that of f(x) = sin x.

2. Do the same as in Exercise 1 for the Taylor polynomials $T_2(\cos x)$, $T_4(\cos x)$, and f(x) = cos x.


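In Exercises 3 through 10, obtain the Taylor polynomials $T_n f(x)$ as indicated. In each case, it
is understood that f(x) is defined for all x for which f(x) is meaningful. Theorems 7.2, 7.3, and
7.4 will help simplify the computations in many cases.

3. $T_n(a^x) = \displaystyle\sum_{k=0}^{n} \frac{(\log a)^k}{k!}\,x^k$.

4. $T_n\left(\dfrac{1}{1 + x}\right) = \displaystyle\sum_{k=0}^{n} (-1)^k x^k$.

5. $T_{2n+1}\left(\dfrac{x}{1 - x^2}\right) = \displaystyle\sum_{k=0}^{n} x^{2k+1}$.

6. $T_n[\log(1 + x)] = \displaystyle\sum_{k=1}^{n} \frac{(-1)^{k+1} x^k}{k}$.

7. $T_{2n+1}\left(\log\sqrt{\dfrac{1 + x}{1 - x}}\right) = \displaystyle\sum_{k=0}^{n} \frac{x^{2k+1}}{2k + 1}$.

8. $T_n\left(\dfrac{1}{2 - x}\right) = \displaystyle\sum_{k=0}^{n} \frac{x^k}{2^{k+1}}$.

9. $T_n[(1 + x)^{\alpha}] = \displaystyle\sum_{k=0}^{n} \binom{\alpha}{k} x^k$, where $\dbinom{\alpha}{k} = \dfrac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k!}$.

10. $T_{2n}(\sin^2 x) = \displaystyle\sum_{k=1}^{n} (-1)^{k+1} \frac{2^{2k-1}}{(2k)!}\,x^{2k}$. [Hint: $\cos 2x = 1 - 2 \sin^2 x$.]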


7.5 Taylor’s formula with remainder 

We turn now to a discussion of the error in the approximation of a function f by its
Taylor polynomial $T_n f$ at a point a. The error is defined to be the difference
$E_n(x) = f(x) - T_n f(x)$. Thus, if f has a derivative of order n at a, we may write

(7.8) $f(x) = \displaystyle\sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k + E_n(x).$

This is known as Taylor's formula with remainder $E_n(x)$; it is useful whenever we can
estimate the size of $E_n(x)$. We shall express the error as an integral and then estimate the
size of the integral. To illustrate the principal ideas, we consider first the error arising
from a linear approximation.

THEOREM 7.5. Assume f has a continuous second derivative f'' in some neighborhood of a.
Then, for every x in this neighborhood, we have

$$f(x) = f(a) + f'(a)(x - a) + E_1(x),$$

where

$$E_1(x) = \int_a^x (x - t) f''(t)\,dt.$$

Proof. From the definition of the error we may write

$$E_1(x) = f(x) - f(a) - f'(a)(x - a) = \int_a^x f'(t)\,dt - f'(a) \int_a^x dt = \int_a^x [f'(t) - f'(a)]\,dt.$$

The last integral may be written as $\int_a^x u\,dv$, where u = f'(t) - f'(a), and v = t - x. Now
du/dt = f''(t) and dv/dt = 1, so the formula for integration by parts gives us

$$E_1(x) = \int_a^x u\,dv = uv\,\Big|_a^x - \int_a^x (t - x) f''(t)\,dt = \int_a^x (x - t) f''(t)\,dt,$$

since u = 0 when t = a, and v = 0 when t = x. This proves the theorem.

The corresponding result for a polynomial approximation of degree n is given by the
following.


THEOREM 7.6. Assume f has a continuous derivative of order n + 1 in some interval
containing a. Then, for every x in this interval, we have the Taylor formula

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x - a)^k + E_n(x),$$

where

$$E_n(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\,dt.$$

Proof. The theorem is proved by induction on n. We have already proved it for n = 1.
Now we assume it is true for some n and prove it for n + 1. We write Taylor's formula
(7.8) with n + 1 and with n and subtract to get

$$E_{n+1}(x) = E_n(x) - \frac{f^{(n+1)}(a)}{(n + 1)!}\,(x - a)^{n+1}.$$

Now we use the integral for $E_n(x)$ and note that $(x - a)^{n+1}/(n + 1) = \int_a^x (x - t)^n\,dt$ to
obtain

$$E_{n+1}(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\,dt - \frac{f^{(n+1)}(a)}{n!} \int_a^x (x - t)^n\,dt
= \frac{1}{n!} \int_a^x (x - t)^n \left[f^{(n+1)}(t) - f^{(n+1)}(a)\right] dt.$$

The last integral may be written in the form $\int_a^x u\,dv$, where $u = f^{(n+1)}(t) - f^{(n+1)}(a)$ and
$v = -(x - t)^{n+1}/(n + 1)$. Integrating by parts and noting that u = 0 when t = a, and that
v = 0 when t = x, we find that

$$E_{n+1}(x) = \frac{1}{n!} \int_a^x u\,dv = -\frac{1}{n!} \int_a^x v\,du = \frac{1}{(n + 1)!} \int_a^x (x - t)^{n+1} f^{(n+2)}(t)\,dt.$$

This completes the inductive step from n to n + 1, so the theorem is true for all $n \ge 1$.
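
The integral form of the remainder can be compared numerically with the error computed directly from its definition. The sketch below does this for the exponential function with a = 0; it is an added illustration, and the use of a composite Simpson rule (with 200 subintervals) to approximate the integral is a choice made here, not something the text prescribes.

```python
import math

def simpson(g, lo, hi, steps=200):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def remainder_direct(x, n):
    """E_n(x) = e^x - T_n(e^x), computed from the definition of the error."""
    return math.exp(x) - sum(x ** k / math.factorial(k) for k in range(n + 1))

def remainder_integral(x, n):
    """E_n(x) = (1/n!) * integral from 0 to x of (x - t)^n e^t dt, approximated numerically."""
    integrand = lambda t: (x - t) ** n * math.exp(t)
    return simpson(integrand, 0.0, x) / math.factorial(n)

print(remainder_direct(1.0, 3), remainder_integral(1.0, 3))
```

The two printed values should agree to many decimal places, up to the quadrature error of Simpson's rule and floating-point rounding.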
7.6 Estimates for the error in Taylor's formula

Since the error $E_n(x)$ in Taylor's formula has been expressed as an integral involving
the (n + 1)st derivative of f, we need some further information about $f^{(n+1)}$ before we can
estimate the size of $E_n(x)$. If upper and lower bounds for $f^{(n+1)}$ are known, we can deduce
corresponding upper and lower bounds for $E_n(x)$, as described in the next theorem.


THEOREM 7.7. If the (n + 1)st derivative of f satisfies the inequalities

(7.9) $m \le f^{(n+1)}(t) \le M$

for all t in some interval containing a, then for every x in this interval we have the following
estimates:

(7.10) $m\,\dfrac{(x - a)^{n+1}}{(n + 1)!} \le E_n(x) \le M\,\dfrac{(x - a)^{n+1}}{(n + 1)!}$ if x > a,

and

(7.11) $m\,\dfrac{(a - x)^{n+1}}{(n + 1)!} \le (-1)^{n+1} E_n(x) \le M\,\dfrac{(a - x)^{n+1}}{(n + 1)!}$ if x < a.

Proof. Assume first that x > a. Then the integral for $E_n(x)$ is extended over the interval
[a, x]. For each t in this interval we have $(x - t)^n \ge 0$, so the inequalities in (7.9) give us

$$m\,\frac{(x - t)^n}{n!} \le \frac{(x - t)^n}{n!}\,f^{(n+1)}(t) \le M\,\frac{(x - t)^n}{n!}.$$

Integrating from a to x, we find that

(7.12) $\dfrac{m}{n!} \displaystyle\int_a^x (x - t)^n\,dt \le E_n(x) \le \dfrac{M}{n!} \displaystyle\int_a^x (x - t)^n\,dt.$

The substitution u = x - t, du = -dt gives us

$$\int_a^x (x - t)^n\,dt = \int_0^{x-a} u^n\,du = \frac{(x - a)^{n+1}}{n + 1},$$

so (7.12) reduces to (7.10).

If x < a, the integration takes place over the interval [x, a]. For each t in this interval
we have $t \ge x$, so $(-1)^n (x - t)^n = (t - x)^n \ge 0$. Therefore, we may multiply the
inequalities (7.9) by the nonnegative factor $(-1)^n (x - t)^n / n!$ and integrate from x to a to
obtain (7.11).


EXAMPLE 1. If $f(x) = e^x$ and a = 0, we have the formula

$$e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + E_n(x).$$

Since $f^{(n+1)}(x) = e^x$, the derivative $f^{(n+1)}$ is monotonic increasing on every interval, and
therefore satisfies the inequalities $e^b \le f^{(n+1)}(t) \le e^c$ on every interval of the form [b, c].
In such an interval, the inequalities for $E_n(x)$ of Theorem 7.7 are satisfied with $m = e^b$ and
$M = e^c$. In particular, when b = 0, we have

$$\frac{x^{n+1}}{(n + 1)!} \le E_n(x) \le e^c\,\frac{x^{n+1}}{(n + 1)!} \quad\text{if } 0 < x \le c.$$

We can use these estimates to calculate the Euler number e. We take b = 0, c = 1,
x = 1, and use the inequality e < 3 to obtain

(7.13) $e = \displaystyle\sum_{k=0}^{n} \frac{1}{k!} + E_n(1)$, where $\dfrac{1}{(n + 1)!} \le E_n(1) < \dfrac{3}{(n + 1)!}$.

This enables us to compute e to any desired degree of accuracy. For example, if we want
the value of e correct to seven decimal places, we choose an n so that $3/(n + 1)! < \frac{1}{2} 10^{-8}$.
We shall see presently that n = 12 suffices. A table of values of 1/n! may be computed
rather quickly because 1/n! may be obtained from 1/(n - 1)! by simply dividing by n. The
following table for $3 \le n \le 12$ contains these numbers rounded off to nine decimals.
The "round-off error" in each case is indicated by a plus or minus sign which tells whether
the correct value exceeds or is less than the recorded value. (In any case, this error is less
than one-half unit in the last decimal place.)


 n    1/n!                  n    1/n!

 3    0.166 666 667 -       8    0.000 024 802 -
 4    0.041 666 667 -       9    0.000 002 756 -
 5    0.008 333 333 +      10    0.000 000 276 -
 6    0.001 388 889 -      11    0.000 000 025 +
 7    0.000 198 413 -      12    0.000 000 002 +


The terms corresponding to n = 0, 1, 2 have sum 5/2. Adding this to the sum of the entries
in the table (for $n \le 12$), we obtain a total of 2.718281830. If we take into account the
roundoff errors, the actual value of this sum may be less than this by as much as 7/2 of a unit
in the last decimal place (due to the seven minus signs) or may exceed this by as much as
3/2 of a unit in the last place (due to the three plus signs). Call the sum s. Then all we can
assert by this calculation is the inequality 2.718281826 < s < 2.718281832. Now the
estimates for the error $E_{12}(1)$ give us $0.000000000 < E_{12}(1) < 0.000000001$. Since
$e = s + E_{12}(1)$, this calculation leads to the following inequalities for e:

2.718281826 < e < 2.718281833.


This tells us that the value of e, correct to seven decimals, is e = 2.7182818, or that the
value of e, rounded off to eight decimals, is e = 2.71828183.
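
The hand computation above can be repeated in exact rational arithmetic, which removes the round-off bookkeeping altogether. The sketch below is an added illustration of the bounds in (7.13) with n = 12; the variable names are chosen here and are not from the text.

```python
from fractions import Fraction
from math import factorial

n = 12
# Exact partial sum 1/0! + 1/1! + ... + 1/n!.
s = sum(Fraction(1, factorial(k)) for k in range(n + 1))

# Bounds from (7.13): 1/(n+1)! <= e - s < 3/(n+1)!.
lower = s + Fraction(1, factorial(n + 1))
upper = s + Fraction(3, factorial(n + 1))

print(float(lower), float(upper))
```

Because the sum is exact, the only uncertainty left is the truncation error $E_{12}(1)$, so the printed interval is narrower than the one obtained from the rounded table.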
EXAMPLE 2. Irrationality of e. We can use the foregoing estimates for the error $E_n(1)$
to prove that e is irrational. First we rewrite the inequalities in (7.13) as follows:

$$\frac{1}{(n + 1)!} \le e - \sum_{k=0}^{n} \frac{1}{k!} < \frac{3}{(n + 1)!}.$$

Multiplying through by n!, we obtain

(7.14) $\dfrac{1}{n + 1} \le n!\,e - \displaystyle\sum_{k=0}^{n} \frac{n!}{k!} < \dfrac{3}{n + 1} \le \dfrac{3}{4}$

if $n \ge 3$. For every n, the sum on k is an integer. If e were rational, we could choose n so
large that n! e would also be an integer. But then (7.14) would tell us that the difference
of these two integers is a positive number not exceeding 3/4, which is impossible. Therefore
e cannot be rational.


Polynomial approximations often enable us to obtain approximate numerical values for
integrals that cannot be evaluated directly in terms of elementary functions. A famous
example is the integral

$$f(x) = \int_0^x e^{-t^2}\,dt$$

which occurs in probability theory and in many physical problems. It is known that the
function f so defined is not an elementary function. That is to say, f cannot be obtained
from polynomials, exponentials, logarithms, trigonometric or inverse trigonometric
functions in a finite number of steps by using the operations of addition, subtraction,
multiplication, division, or composition. Other examples which occur rather frequently
in both theory and practice are the integrals

$$\int_0^x \frac{\sin t}{t}\,dt, \qquad \int_0^x \sin(t^2)\,dt, \qquad \int_0^x \sqrt{1 - k^2 \sin^2 t}\,dt.$$

(In the first of these, it is understood that the quotient (sin t)/t is to be replaced by 1 when
t = 0. In the third integral, k is a constant, 0 < k < 1.) We conclude this section with
an example which illustrates how Taylor's formula may be used to obtain an accurate
estimate of the integral $\int_0^{1/2} e^{-t^2}\,dt$.


EXAMPLE 3. The Taylor formula for $e^x$ with n = 4 gives us

(7.15) $e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dfrac{x^4}{4!} + E_4(x).$

Suppose now that $x \le 0$. In any interval of the form [-c, 0] we have $e^{-c} \le e^x \le 1$, so we
may use the inequalities (7.11) of Theorem 7.7 with $m = e^{-c}$ and M = 1 to write

$$0 < (-1)^5 E_4(x) \le \frac{(-x)^5}{5!} \quad\text{if } x < 0.$$

In other words, if x < 0, then $E_4(x)$ is negative and $\ge x^5/5!$. Replacing x by $-t^2$ in (7.15),
we have

(7.16) $e^{-t^2} = 1 - t^2 + \dfrac{t^4}{2!} - \dfrac{t^6}{3!} + \dfrac{t^8}{4!} + E_4(-t^2),$

where $-t^{10}/5! \le E_4(-t^2) \le 0$. If $0 \le t \le \frac{1}{2}$, we find that $t^{10}/5! \le (\frac{1}{2})^{10}/5! < 0.000\,009$.
Thus, if we integrate (7.16) from 0 to $\frac{1}{2}$, we obtain

$$\int_0^{1/2} e^{-t^2}\,dt = \frac{1}{2} - \frac{1}{3 \cdot 2^3} + \frac{1}{5 \cdot 2^5 \cdot 2!} - \frac{1}{7 \cdot 2^7 \cdot 3!} + \frac{1}{9 \cdot 2^9 \cdot 4!} - \theta,$$

where $0 < \theta \le 0.000\,0045$. Rounding off to four decimals, we find $\int_0^{1/2} e^{-t^2}\,dt = 0.4613$.
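
The five-term estimate is quick to reproduce. In the sketch below, which is an added illustration, the comparison value comes from the standard relation between this integral and the error function, $\int_0^x e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(x)$; that relation and the use of Python's `math.erf` are not part of the text and serve only as an independent check.

```python
import math

# Five terms obtained by integrating (7.16) from 0 to 1/2.
terms = [
    1 / 2,
    -1 / (3 * 2 ** 3),
    1 / (5 * 2 ** 5 * math.factorial(2)),
    -1 / (7 * 2 ** 7 * math.factorial(3)),
    1 / (9 * 2 ** 9 * math.factorial(4)),
]
approx = sum(terms)

# Independent reference value via the error function (an outside fact, see lead-in).
reference = math.sqrt(math.pi) / 2 * math.erf(0.5)

# The difference plays the role of theta and should lie between 0 and 0.0000045.
print(round(approx, 4), approx - reference)
```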


*7.7 Other forms of the remainder in Taylor’s formula 

We have expressed the error in Taylor's formula as an integral,

$$E_n(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\,dt.$$

It can also be expressed in many other forms. Since the factor $(x - t)^n$ in the integrand
never changes sign in the interval of integration, and since $f^{(n+1)}$ is continuous on this
interval, the weighted mean-value theorem for integrals (Theorem 3.16) gives us

$$\int_a^x (x - t)^n f^{(n+1)}(t)\,dt = f^{(n+1)}(c) \int_a^x (x - t)^n\,dt = f^{(n+1)}(c)\,\frac{(x - a)^{n+1}}{n + 1},$$

where c lies in the closed interval joining a and x. Therefore, the error can be written as

$$E_n(x) = \frac{f^{(n+1)}(c)}{(n + 1)!}\,(x - a)^{n+1}.$$

This is called Lagrange's form of the remainder. It resembles the earlier terms in Taylor's
formula, except that the derivative $f^{(n+1)}(c)$ is evaluated at some unknown point c rather
than at a. The point c depends on x and on n, as well as on f.

Using a different type of argument, we can drop the continuity requirement on $f^{(n+1)}$
and derive Lagrange's formula and other forms of the remainder under a weaker hypothesis.
Suppose that $f^{(n+1)}$ exists in some open interval (h, k) containing the point a, and assume
that $f^{(n)}$ is continuous in the closed interval [h, k]. Choose any $x \ne a$ in [h, k]. For
simplicity, say x > a. Keep x fixed and define a new function F on the interval [a, x] as
follows:

$$F(t) = f(t) + \sum_{k=1}^{n} \frac{f^{(k)}(t)}{k!}\,(x - t)^k.$$

Note that F(x) = f(x) and $F(a) = T_n f(x; a)$, so $F(x) - F(a) = E_n(x)$. The function F is
continuous in the closed interval [a, x] and has a derivative in the open interval (a, x). If
we compute F'(t), keeping in mind that each term of the sum defining F(t) is a product, we
find that all terms cancel except one, and we are left with the equation

$$F'(t) = \frac{(x - t)^n}{n!}\,f^{(n+1)}(t).$$

Now let G be any function that is continuous on [a, x] and differentiable on (a, x). Then
we can apply Cauchy's mean-value formula (Theorem 4.6) to write

$$G'(c)[F(x) - F(a)] = F'(c)[G(x) - G(a)],$$

for some c in the open interval (a, x). If G' is nonzero in (a, x), this gives the following
formula for the error $E_n(x)$:

$$E_n(x) = \frac{F'(c)}{G'(c)}\,[G(x) - G(a)].$$

We can express the error in various forms by different choices of G. For example, taking
$G(t) = (x - t)^{n+1}$, we obtain Lagrange's form,

$$E_n(x) = \frac{f^{(n+1)}(c)}{(n + 1)!}\,(x - a)^{n+1}, \quad\text{where } a < c < x.$$

Taking G(t) = x - t, we obtain another formula, called Cauchy's form of the remainder,

$$E_n(x) = \frac{f^{(n+1)}(c)}{n!}\,(x - c)^n (x - a), \quad\text{where } a < c < x.$$

If $G(t) = (x - t)^p$, where $p \ge 1$, we obtain the formula

$$E_n(x) = \frac{f^{(n+1)}(c)}{n!\,p}\,(x - c)^{n+1-p} (x - a)^p, \quad\text{where } a < c < x.$$
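
For the exponential function the unknown point c in Lagrange's form can be solved for explicitly, which makes the statement a < c < x concrete. The sketch below is an added illustration for $f(x) = e^x$, a = 0 and x > 0; the function name `lagrange_point` is hypothetical.

```python
import math

def lagrange_point(x, n):
    """Solve E_n(x) = e^c * x^(n+1)/(n+1)! for c, with f = exp, a = 0 and x > 0."""
    error = math.exp(x) - sum(x ** k / math.factorial(k) for k in range(n + 1))
    return math.log(error * math.factorial(n + 1) / x ** (n + 1))

# Each computed c should lie strictly between a = 0 and x = 1.
for n in (1, 2, 4):
    print(n, lagrange_point(1.0, n))
```

For large n the direct subtraction used to compute the error loses accuracy in floating point, so this check is only meaningful for small n.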


7.8 Exercises 


Examples of Taylor's formula with remainder are given in Exercises 1, 2, and 3. In each case
prove that the error satisfies the given inequalities.

1. $\sin x = \displaystyle\sum_{k=1}^{n} \frac{(-1)^{k-1} x^{2k-1}}{(2k - 1)!} + E_{2n}(x), \qquad |E_{2n}(x)| \le \frac{|x|^{2n+1}}{(2n + 1)!}$.

2. $\cos x = \displaystyle\sum_{k=0}^{n} \frac{(-1)^k x^{2k}}{(2k)!} + E_{2n+1}(x), \qquad |E_{2n+1}(x)| \le \frac{|x|^{2n+2}}{(2n + 2)!}$.

3. $\arctan x = \displaystyle\sum_{k=0}^{n-1} \frac{(-1)^k x^{2k+1}}{2k + 1} + E_{2n}(x), \qquad |E_{2n}(x)| \le \frac{x^{2n+1}}{2n + 1}$ if $0 \le x \le 1$.

4. (a) Obtain the number $r = \sqrt{15} - 3$ as an approximation to the nonzero root of the equation
$x^2 = \sin x$ by using the cubic Taylor polynomial approximation to sin x.

(b) Show that the approximation in part (a) satisfies the inequality

$$|\sin r - r^2| < \frac{1}{200},$$

given that $\sqrt{15} - 3 < 0.9$. Is the difference $(\sin r - r^2)$ positive or negative? Give full
details of your reasoning.

5. (a) Use the cubic Taylor polynomial approximation to arctan x to obtain the number
$r = (\sqrt{21} - 3)/2$ as an approximation to the nonzero root of the equation $\arctan x = x^2$.

(b) Given that $\sqrt{21} < 4.6$ and that $2^{16} = 65536$, prove that the approximation in part (a)
satisfies the inequality

$$|r^2 - \arctan r| < \frac{7}{100}.$$

Is the difference $(r^2 - \arctan r)$ positive or negative? Give full details of your reasoning.

6. Prove that $\displaystyle\int_0^1 \frac{1 + x^{30}}{1 + x^{60}}\,dx = 1 + \frac{c}{31}$, where 0 < c < 1.

p /2 ] 

7. Prove that 0.493948 < r ~ —3 dx < 0.493958. 

Jo 1 

8. (a) If $0 \le x \le \frac{1}{2}$, show that $\sin x = x - x^3/3! + r(x)$, where $|r(x)| \le (\frac{1}{2})^5/5!$.

(b) Use the estimate in part (a) to find an approximate value for the integral $\int_0^{\sqrt{2}/2} \sin(x^2)\,dx$.
Make sure you give an estimate for the error.

9. Use the first three nonzero terms of Taylor's formula for sin x to find an approximate value
for the integral $\int_0^1 (\sin x)/x\,dx$ and give an estimate for the error. [It is to be understood that
the quotient (sin x)/x is equal to 1 when x = 0.]

10. This exercise outlines a method for computing $\pi$, using Taylor's formula for arctan x given in
Exercise 3. It is based on the fact that $\pi$ is nearly 3.2, so $\frac{1}{4}\pi$ is nearly 0.8 or $\frac{4}{5}$, and this is nearly
$4 \arctan \frac{1}{5}$. Let $\alpha = \arctan \frac{1}{5}$, $\beta = 4\alpha - \frac{1}{4}\pi$.

(a) Use the identity tan(A + B) = (tan A + tan B)/(1 - tan A tan B) with $A = B = \alpha$ and
then again with $A = B = 2\alpha$ to get $\tan 2\alpha = \frac{5}{12}$ and $\tan 4\alpha = \frac{120}{119}$. Then use the identity
once more with $A = 4\alpha$, $B = -\frac{1}{4}\pi$ to obtain $\tan \beta = \frac{1}{239}$. This yields the following
remarkable identity discovered in 1706 by John Machin (1680-1751):

$$\pi = 16 \arctan \tfrac{1}{5} - 4 \arctan \tfrac{1}{239}.$$

(b) Use the Taylor polynomial $T_{11}(\arctan x)$ with $x = \frac{1}{5}$ to show that

$$3.158328934 < 16 \arctan \tfrac{1}{5} < 3.158328972.$$

(c) Use the Taylor polynomial $T_3(\arctan x)$ with $x = \frac{1}{239}$ to show that

$$-0.016736309 < -4 \arctan \tfrac{1}{239} < -0.016736300.$$

(d) Use parts (a), (b) and (c) to show that the value of $\pi$, correct to seven decimals, is
3.1415926.
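
As a numerical companion to Exercise 10 (not a substitute for the error analysis it asks for), the two Taylor polynomials in parts (b) and (c) can be evaluated exactly with rational arithmetic. The sketch below is an added illustration; the helper name `arctan_taylor` is hypothetical.

```python
from fractions import Fraction
import math

def arctan_taylor(x, degree):
    """Sum of (-1)^k x^(2k+1)/(2k+1) over all odd powers 2k+1 not exceeding degree."""
    return sum(
        Fraction((-1) ** k, 2 * k + 1) * x ** (2 * k + 1)
        for k in range((degree + 1) // 2)
    )

# Machin's identity: pi = 16 arctan(1/5) - 4 arctan(1/239).
first = 16 * arctan_taylor(Fraction(1, 5), 11)
second = 4 * arctan_taylor(Fraction(1, 239), 3)
estimate = first - second

print(float(first), float(second), float(estimate), math.pi)
```

The small arguments 1/5 and 1/239 are what make so few terms sufficient: each further term of the arctan polynomial shrinks by roughly a factor of $x^2$.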
7.9 Further remarks on the error in Taylor’s formula. The o-notation

If $f$ has a continuous $(n + 1)$st derivative in some interval containing a point $a$, we may
write Taylor’s formula in the form

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k + E_n(x). \tag{7.17}$$

Suppose we restrict $x$ to lie in some closed interval $[a - c, a + c]$ about $a$, in which $f^{(n+1)}$
is continuous. Then $f^{(n+1)}$ is bounded on this interval and hence satisfies an inequality of
the form

$$|f^{(n+1)}(t)| \le M,$$

where $M > 0$. Hence, by Theorem 7.7, we have the error estimate

$$|E_n(x)| \le M \frac{|x - a|^{n+1}}{(n + 1)!}$$

for each $x$ in $[a - c, a + c]$. If we keep $x \ne a$ and divide this inequality by $|x - a|^n$, we
find that

$$0 \le \left| \frac{E_n(x)}{(x - a)^n} \right| \le \frac{M}{(n + 1)!}\,|x - a|.$$

If now we let $x \to a$, we see that $E_n(x)/(x - a)^n \to 0$. We describe this by saying that the
error $E_n(x)$ is of smaller order than $(x - a)^n$ as $x \to a$.

In other words, under the conditions stated, $f(x)$ may be approximated near $a$ by a
polynomial in $(x - a)$ of degree $n$, and the error in this approximation is of smaller order
than $(x - a)^n$ as $x \to a$.

A special notation, introduced in 1909 by E. Landau,† is particularly appropriate when
used in connection with Taylor’s formula. This is called the o-notation (the little-oh
notation) and it is defined as follows.

DEFINITION. Assume $g(x) \ne 0$ for all $x \ne a$ in some interval containing $a$. The notation

$$f(x) = o(g(x)) \quad \text{as } x \to a$$

means that

$$\lim_{x \to a} \frac{f(x)}{g(x)} = 0.$$

The symbol $f(x) = o(g(x))$ is read “$f(x)$ is little-oh of $g(x)$,” or “$f(x)$ is of smaller order
than $g(x)$,” and it is intended to convey the idea that for $x$ near $a$, $f(x)$ is small compared
with $g(x)$.

† Edmund Landau (1877-1938) was a famous German mathematician who made many important contributions to mathematics. He is best known for his lucid books in analysis and in the theory of numbers.

EXAMPLE 1. $f(x) = o(1)$ as $x \to a$ means that $f(x) \to 0$ as $x \to a$.

EXAMPLE 2. $f(x) = o(x)$ as $x \to 0$ means that $\dfrac{f(x)}{x} \to 0$ as $x \to 0$.

An equation of the form $f(x) = h(x) + o(g(x))$ is understood to mean that $f(x) - h(x) = o(g(x))$ or, in other words, $[f(x) - h(x)]/g(x) \to 0$ as $x \to a$.

EXAMPLE 3. We have $\sin x = x + o(x)$ because $\dfrac{\sin x - x}{x} = \dfrac{\sin x}{x} - 1 \to 0$ as $x \to 0$.
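The definition can be illustrated numerically for Example 3. This is only a sketch: the helper `ratio` and the sample points are choices made here for illustration, and a table of values suggests a limit but does not prove it.

```python
import math

def ratio(f, g, x):
    """Return f(x)/g(x); f = o(g) as x -> 0 means this ratio tends to 0."""
    return f(x) / g(x)

f = lambda x: math.sin(x) - x   # claimed to be o(x) as x -> 0
g = lambda x: x

samples = [10.0 ** (-k) for k in range(1, 6)]
ratios = [ratio(f, g, x) for x in samples]
for x, r in zip(samples, ratios):
    print(f"x = {x:.0e}   (sin x - x)/x = {r:.3e}")
```

The printed ratios shrink roughly like $x^2$, which is consistent with the sharper statement $\sin x = x + o(x^2)$ obtained from Taylor’s formula.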


The foregoing remarks concerning the error in Taylor’s formula can now be expressed
in the o-notation. We may write

$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k + o((x - a)^n) \quad \text{as } x \to a,$$

whenever the derivative $f^{(n+1)}$ is continuous in some closed interval containing the point $a$.
This expresses, in a brief way, the fact that the error term is small compared to $(x - a)^n$
when $x$ is near $a$. In particular, from the discussion of earlier sections, we have the following
examples of Taylor’s formula expressed in the o-notation:

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + o(x^n) \quad \text{as } x \to 0,$$

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots + (-1)^{n-1}\frac{x^n}{n} + o(x^n) \quad \text{as } x \to 0,$$

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + o(x^n) \quad \text{as } x \to 0,$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{(2n - 1)!} + o(x^{2n}) \quad \text{as } x \to 0,$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + o(x^{2n+1}) \quad \text{as } x \to 0,$$

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{2n - 1} + o(x^{2n}) \quad \text{as } x \to 0.$$


In calculations involving Taylor approximations, it often becomes necessary to combine
several terms involving the o-symbol. A few simple rules for manipulating o-symbols are
discussed in the next theorem. These cover most situations that arise in practice.

THEOREM 7.8. ALGEBRA OF o-SYMBOLS. As $x \to a$, we have the following:

(a) $o(g(x)) \pm o(g(x)) = o(g(x))$.

(b) $o(cg(x)) = o(g(x))$ if $c \ne 0$.

(c) $f(x) \cdot o(g(x)) = o(f(x)g(x))$.

(d) $o(o(g(x))) = o(g(x))$.

(e) $\dfrac{1}{1 + g(x)} = 1 - g(x) + o(g(x))$ if $g(x) \to 0$ as $x \to a$.

Proof. The statement in part (a) is understood to mean that if $f_1(x) = o(g(x))$ and if
$f_2(x) = o(g(x))$, then $f_1(x) \pm f_2(x) = o(g(x))$. But since we have

$$\frac{f_1(x) \pm f_2(x)}{g(x)} = \frac{f_1(x)}{g(x)} \pm \frac{f_2(x)}{g(x)},$$

each term on the right tends to 0 as $x \to a$, so part (a) is proved. The statements in (b),
(c), and (d) are proved in a similar way.

To prove (e), we use the algebraic identity

$$\frac{1}{1 + u} = 1 - u + u \cdot \frac{u}{1 + u}$$

with $u$ replaced by $g(x)$ and then note that

$$\frac{g(x)}{1 + g(x)} \to 0 \quad \text{as } x \to a.$$


EXAMPLE 1. Prove that $\tan x = x + \tfrac13 x^3 + o(x^3)$ as $x \to 0$.

Solution. We use the Taylor approximations for the sine and cosine. From part (e) of
Theorem 7.8, with $g(x) = -\tfrac12 x^2 + o(x^3)$, we have

$$\frac{1}{\cos x} = \frac{1}{1 - \tfrac12 x^2 + o(x^3)} = 1 + \tfrac12 x^2 + o(x^2) \quad \text{as } x \to 0.$$

Therefore, we have

$$\tan x = \frac{\sin x}{\cos x} = \left(x - \tfrac16 x^3 + o(x^4)\right)\left(1 + \tfrac12 x^2 + o(x^2)\right) = x + \tfrac13 x^3 + o(x^3).$$
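A quick numerical check of the expansion for $\tan x$ is sketched below. The function name `scaled_remainder` is introduced here for illustration; the quantity it computes is the error of the cubic approximation divided by $x^3$, which should tend to 0 if the error is $o(x^3)$.

```python
import math

def scaled_remainder(x):
    """(tan x - (x + x^3/3)) / x^3, expected to tend to 0 as x -> 0."""
    return (math.tan(x) - (x + x ** 3 / 3)) / x ** 3

values = [scaled_remainder(10.0 ** (-k)) for k in (1, 2, 3)]
for k, v in zip((1, 2, 3), values):
    print(f"x = 1e-{k}: scaled remainder = {v:.3e}")
```

For much smaller $x$ the subtraction in the numerator loses accuracy in floating-point arithmetic, so the sample points are kept moderate.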


EXAMPLE 2. Prove that

$$(1 + x)^{1/x} = e\left(1 - \frac{x}{2} + \frac{11x^2}{24} + o(x^2)\right) \quad \text{as } x \to 0.$$

Solution. Since $(1 + x)^{1/x} = e^{(1/x)\log(1 + x)}$, we begin with a polynomial approximation
to $\log(1 + x)$. Taking a cubic approximation, we have

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} + o(x^3), \qquad \frac{\log(1 + x)}{x} = 1 - \frac{x}{2} + \frac{x^2}{3} + o(x^2),$$

and so we obtain

$$(1 + x)^{1/x} = \exp\left(1 - x/2 + x^2/3 + o(x^2)\right) = e \cdot e^u, \tag{7.18}$$

where $u = -x/2 + x^2/3 + o(x^2)$. But as $u \to 0$, we have $e^u = 1 + u + \tfrac12 u^2 + o(u^2)$, so we
obtain

$$e^u = 1 - \frac{x}{2} + \frac{x^2}{3} + o(x^2) + \frac12\left(-\frac{x}{2} + \frac{x^2}{3} + o(x^2)\right)^2 + o(x^2) = 1 - \frac{x}{2} + \frac{11x^2}{24} + o(x^2).$$

When we use this in Equation (7.18), we obtain the desired formula.
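The quadratic approximation of Example 2 can be compared with direct evaluation. In this sketch the names `approx` and `exact` are illustrative; the error divided by $x^2$ should become small as $x \to 0$ if the remainder is $o(x^2)$.

```python
import math

def approx(x):
    """Quadratic approximation e*(1 - x/2 + 11x^2/24) from Example 2."""
    return math.e * (1 - x / 2 + 11 * x ** 2 / 24)

def exact(x):
    """Direct evaluation of (1 + x)^(1/x) for x > -1, x != 0."""
    return (1 + x) ** (1 / x)

for x in (0.1, 0.01, 0.001):
    err = exact(x) - approx(x)
    print(f"x = {x}: error / x^2 = {err / x ** 2:.3e}")
```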


7.10 Applications to indeterminate forms 

We have already illustrated how polynomial approximations are used in the computation 
of function values. They can also be used as an aid in the calculation of limits. We illustrate 
with some examples. 

EXAMPLE 1. If $a$ and $b$ are positive numbers, determine the limit

$$\lim_{x \to 0} \frac{a^x - b^x}{x}.$$

Solution. We cannot solve this problem by computing the limit of the numerator and
denominator separately, because the denominator tends to 0 and the quotient theorem on
limits is not applicable. The numerator in this case also tends to 0 and the quotient is said
to assume the “indeterminate form 0/0” as $x \to 0$. Taylor’s formula and the o-notation
often enable us to calculate the limit of an indeterminate form like this one very simply.
The idea is to approximate the numerator $a^x - b^x$ by a polynomial in $x$, then divide by $x$
and let $x \to 0$. We could apply Taylor’s formula directly to $f(x) = a^x - b^x$ but, since
$a^x = e^{x \log a}$ and $b^x = e^{x \log b}$, it is simpler in this case to use the polynomial approximations
already derived for the exponential function. If we begin with the linear approximation

$$e^t = 1 + t + o(t) \quad \text{as } t \to 0$$

and replace $t$ by $x \log a$ and $x \log b$, respectively, we find

$$a^x = 1 + x \log a + o(x) \quad \text{and} \quad b^x = 1 + x \log b + o(x) \quad \text{as } x \to 0.$$

Here we have used the fact that $o(x \log a) = o(x)$ and $o(x \log b) = o(x)$. If now we subtract
and note that $o(x) - o(x) = o(x)$, we find $a^x - b^x = x(\log a - \log b) + o(x)$. Dividing
by $x$ and using the relation $o(x)/x = o(1)$, we obtain

$$\frac{a^x - b^x}{x} = \log \frac{a}{b} + o(1) \to \log \frac{a}{b} \quad \text{as } x \to 0.$$
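The limit in Example 1 can be observed numerically. The sketch below uses the sample values $a = 3$, $b = 2$, which are chosen here only for illustration.

```python
import math

def quotient(a, b, x):
    """(a^x - b^x)/x for positive a, b and x != 0."""
    return (a ** x - b ** x) / x

a, b = 3.0, 2.0           # illustrative values, not from the text
limit = math.log(a / b)   # the value log(a/b) derived in Example 1
for x in (1e-1, 1e-3, 1e-5):
    print(f"x = {x:.0e}: quotient = {quotient(a, b, x):.8f}, log(a/b) = {limit:.8f}")
```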
EXAMPLE 2. Prove that

$$\lim_{x \to 0} \frac{1}{x}\left(\cot x - \frac{1}{x}\right) = -\frac13.$$

Solution. We use Example 1 of Section 7.9 and Theorem 7.8(e) to write

$$\cot x = \frac{1}{\tan x} = \frac{1}{x + \tfrac13 x^3 + o(x^3)} = \frac{1}{x} \cdot \frac{1}{1 + \tfrac13 x^2 + o(x^2)} = \frac{1}{x}\left(1 - \tfrac13 x^2 + o(x^2)\right) = \frac{1}{x} - \tfrac13 x + o(x).$$

Hence, we have

$$\frac{1}{x}\left(\cot x - \frac{1}{x}\right) = -\frac13 + o(1) \to -\frac13 \quad \text{as } x \to 0.$$

EXAMPLE 3. Prove that

$$\lim_{x \to 0} \frac{\log(1 + ax)}{x} = a$$

for every real $a$.

Solution. If $a = 0$, the result holds trivially. If $a \ne 0$, we use the linear approximation
$\log(1 + x) = x + o(x)$. Replacing $x$ by $ax$, we obtain $\log(1 + ax) = ax + o(ax) = ax + o(x)$. Dividing by $x$ and letting $x \to 0$, we obtain the limit $a$.

EXAMPLE 4. Prove that for every real $a$, we have

$$\lim_{x \to 0} (1 + ax)^{1/x} = e^a. \tag{7.19}$$

Solution. We simply note that $(1 + ax)^{1/x} = e^{(1/x)\log(1 + ax)}$ and use the result of Example
3 along with the continuity of the exponential function.

Replacing $ax$ by $y$ in (7.19), we find another important limit relation:

$$\lim_{y \to 0} (1 + y)^{a/y} = e^a.$$

Sometimes these limit relations are taken as the starting point for the theory of the
exponential function.
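The limit relation (7.19) is easy to observe numerically. In the sketch below the value $a = -2.5$ and the helper name `power_form` are illustrative choices; the sample points are small enough that $1 + ax$ stays positive.

```python
import math

def power_form(a, x):
    """(1 + a x)^(1/x), defined when 1 + a x > 0 and x != 0."""
    return (1 + a * x) ** (1 / x)

a = -2.5   # illustrative value; (7.19) holds for every real a
for x in (1e-2, 1e-4, 1e-6, -1e-6):
    print(f"x = {x:.0e}: (1 + ax)^(1/x) = {power_form(a, x):.8f}, e^a = {math.exp(a):.8f}")
```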


7.11 Exercises 

1. Find a quadratic polynomial $P(x)$ such that $2^x = P(x) + o(x^2)$ as $x \to 0$.

2. Find a cubic polynomial $P(x)$ such that $x \cos x = P(x) + o((x - 1)^3)$ as $x \to 1$.

3. Find the polynomial $P(x)$ of smallest degree such that $\sin(x - x^2) = P(x) + o(x^6)$ as $x \to 0$.

4. Find constants $a$, $b$, $c$ such that $\log x = a + b(x - 1) + c(x - 1)^2 + o((x - 1)^2)$ as $x \to 1$.

5. Recall that $\cos x = 1 - \tfrac12 x^2 + o(x^3)$ as $x \to 0$. Use this to prove that $x^{-2}(1 - \cos x) \to \tfrac12$
as $x \to 0$. In a similar way, find the limit of $x^{-4}(1 - \cos 2x - 2x^2)$ as $x \to 0$.

Evaluate the limits in Exercises 6 through 29.

6. $\lim_{x \to 0} \dfrac{\sin ax}{\sin bx}$.

7. $\lim_{x \to 0} \dfrac{\tan 2x}{\sin 3x}$.

8. $\lim_{x \to 0} \dfrac{\sin x - x}{x^3}$.

9. $\lim_{x \to 0} \dfrac{\log(1 + x)}{e^{2x} - 1}$.

10. $\lim_{x \to 0} \dfrac{1 - \cos^2 x}{x \tan x}$.

11. $\lim_{x \to 0} \dfrac{\sin x}{\arctan x}$.

12. $\lim_{x \to 0} \dfrac{a^x - 1}{b^x - 1}$, $b \ne 1$.

13. $\lim_{x \to 1} \dfrac{\log x}{x^2 + x - 2}$.

14. $\lim_{x \to 0} \dfrac{1 - \cos x^2}{x^2 \sin x^2}$.

15. $\lim_{x \to 0} \dfrac{x(e^x + 1) - 2(e^x - 1)}{x^3}$.

16. $\lim_{x \to 0} \dfrac{\log(1 + x) - x}{1 - \cos x}$.

17. $\lim_{x \to \frac12\pi} \dfrac{\cos x}{x - \tfrac12\pi}$.

18. $\lim_{x \to 1} \dfrac{[\sin(\pi/2x)](\log x)}{(x^3 + 5)(x - 1)}$.

19. $\lim_{x \to 0} \dfrac{\cosh x - \cos x}{x^2}$.

20. $\lim_{x \to 0} \dfrac{3 \tan 4x - 12 \tan x}{3 \sin 4x - 12 \sin x}$.

21. [expression illegible in source]

22. $\lim_{x \to 0}$ of a quotient with numerator $\cos(\sin x) - \cos x$ [denominator illegible in source].

23. [expression illegible in source]

24. $\lim_{x \to 0} (x + e^{2x})^{1/x}$.

25. $\lim_{x \to 0} \dfrac{(1 + x)^{1/x} - e}{x}$.

26. $\lim_{x \to 0} \left(\dfrac{(1 + x)^{1/x}}{e}\right)^{1/x}$.

27. $\lim_{x \to 0} \left(\dfrac{\arcsin x}{x}\right)^{1/x^2}$.

28. [expression illegible in source]

29. $\lim_{x \to 1} \left(\dfrac{1}{\log x} - \dfrac{1}{x - 1}\right)$.

30. For what value of the constant $a$ will $x^{-2}(e^{ax} - e^x - x)$ tend to a finite limit as $x \to 0$? What
is the value of this limit?

31. Given two functions $f$ and $g$ with derivatives in some interval containing 0, where $g$ is positive.
Assume also $f(x) = o(g(x))$ as $x \to 0$. Prove or disprove each of the following statements:

(a) $\int_0^x f(t)\,dt = o\left(\int_0^x g(t)\,dt\right)$ as $x \to 0$, (b) $f'(x) = o(g'(x))$ as $x \to 0$.

32. (a) If $g(x) = o(1)$ as $x \to 0$, prove that

$$\frac{1}{1 + g(x)} = 1 - g(x) + g^2(x) + o(g^2(x)) \quad \text{as } x \to 0.$$

(b) Use part (a) to prove that $\tan x = x + \dfrac{x^3}{3} + \dfrac{2x^5}{15} + o(x^5)$ as $x \to 0$.

33. A function $f$ has a continuous third derivative everywhere and satisfies the relation

[relation illegible in source]

Compute $f(0)$, $f'(0)$, $f''(0)$, and $\lim_{x \to 0} \left(1 + \dfrac{f(x)}{x}\right)^{1/x}$.

[Hint: If $\lim_{x \to 0} g(x) = A$, then $g(x) = A + o(1)$ as $x \to 0$.]


7.12 L’Hopital’s rule for the indeterminate form 0/0 

In many examples in the foregoing sections we have calculated the limit of a quotient
$f(x)/g(x)$ in which both the numerator $f(x)$ and the denominator $g(x)$ approached 0. In
examples like these, the quotient $f(x)/g(x)$ is said to assume the “indeterminate form 0/0.”

One way to attack problems on indeterminate forms is to obtain polynomial approximations to $f(x)$ and $g(x)$ as we did in treating the above examples. Sometimes the work can
be shortened by use of a differentiation technique known as L’Hopital’s rule.† The basic
idea of the method is to study the quotient of derivatives $f'(x)/g'(x)$ and thereby to try to
deduce information about $f(x)/g(x)$.

Before stating L’Hopital’s rule, we show why the quotient of derivatives $f'(x)/g'(x)$ bears
a relation to the quotient $f(x)/g(x)$. Suppose $f$ and $g$ are two functions with $f(a) = g(a) = 0$.
Then, for $x \ne a$, we have

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f(x) - f(a)}{x - a} \bigg/ \frac{g(x) - g(a)}{x - a}.$$

If the derivatives $f'(a)$ and $g'(a)$ exist, and if $g'(a) \ne 0$, then as $x \to a$ the quotient on the
right approaches $f'(a)/g'(a)$ and hence $f(x)/g(x) \to f'(a)/g'(a)$.

EXAMPLE. Compute $\lim_{x \to 0} \dfrac{1 - e^{2x}}{x}$.

Solution. Here $f(x) = 1 - e^{2x}$ and $g(x) = x$, so $f'(x) = -2e^{2x}$, $g'(x) = 1$. Hence we
have $f'(0)/g'(0) = -2$, so the limit in question is $-2$.
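The example above can be checked numerically by comparing the quotient $f(x)/g(x)$ near 0 with the ratio of derivatives at 0. This is a sketch only; it uses `math.expm1`, which computes $e^t - 1$ accurately for small $t$, to avoid cancellation in the numerator.

```python
import math

def quotient(x):
    """f(x)/g(x) with f(x) = 1 - e^(2x) and g(x) = x, for x != 0."""
    return -math.expm1(2 * x) / x   # expm1(t) = e^t - 1

derivative_ratio = -2 * math.exp(0.0) / 1   # f'(0)/g'(0)
for x in (0.1, -0.1, 1e-4, -1e-4):
    print(f"x = {x}: f(x)/g(x) = {quotient(x):.6f}, f'(0)/g'(0) = {derivative_ratio}")
```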

In L’Hopital’s rule, no assumptions are made about $f$, $g$, or their derivatives at the point
$x = a$. Instead, we assume that $f(x)$ and $g(x)$ approach 0 as $x \to a$ and that the quotient
$f'(x)/g'(x)$ tends to a finite limit as $x \to a$. L’Hopital’s rule then tells us that $f(x)/g(x)$ tends
to the same limit. More precisely, we have the following.

THEOREM 7.9. L’HOPITAL’S RULE FOR 0/0. Assume $f$ and $g$ have derivatives $f'(x)$ and
$g'(x)$ at each point $x$ of an open interval $(a, b)$, and suppose that

$$\lim_{x \to a+} f(x) = 0 \quad \text{and} \quad \lim_{x \to a+} g(x) = 0. \tag{7.20}$$

Assume also that $g'(x) \ne 0$ for each $x$ in $(a, b)$. If the limit

$$\lim_{x \to a+} \frac{f'(x)}{g'(x)} \tag{7.21}$$

exists and has the value $L$, say, then the limit

$$\lim_{x \to a+} \frac{f(x)}{g(x)} \tag{7.22}$$

also exists and has the value $L$.

† In 1696, Guillaume François Antoine de L’Hopital (1661-1704) wrote the first textbook on differential
calculus. This work appeared in many editions and played a significant role in the popularization of the
subject. Much of the content of the book, including the method known as “L’Hopital’s rule,” was based
on the earlier work of Johann Bernoulli, one of L’Hopital’s teachers.

Note that the limits in (7.20), (7.21), and (7.22) are “right-handed.” There is, of course,
a similar theorem in which the hypotheses are satisfied in some open interval of the form
$(b, a)$ and all the limits are “left-handed.” Also, by combining the two “one-sided”
theorems, there follows a “two-sided” result of the same kind in which $x \to a$ in an
unrestricted fashion.

Before we discuss the proof of Theorem 7.9, we shall illustrate the use of this theorem
in a number of examples.

EXAMPLE 1. We shall use L’Hopital’s rule to obtain the familiar formula

$$\lim_{x \to 0} \frac{\sin x}{x} = 1. \tag{7.23}$$

Here $f(x) = \sin x$ and $g(x) = x$. The quotient of derivatives is $f'(x)/g'(x) = (\cos x)/1$ and
this tends to 1 as $x \to 0$. By Theorem 7.9 the limit in (7.23) also exists and equals 1.

EXAMPLE 2. To determine the limit

$$\lim_{x \to 0} \frac{x - \tan x}{x - \sin x}$$

by L’Hopital’s rule, we let $f(x) = x - \tan x$, $g(x) = x - \sin x$, and we find that

$$\frac{f'(x)}{g'(x)} = \frac{1 - \sec^2 x}{1 - \cos x}. \tag{7.24}$$

Although this, too, assumes the form 0/0 as $x \to 0$, we may remove the indeterminacy at
this stage by algebraic means. If we write

$$1 - \sec^2 x = 1 - \frac{1}{\cos^2 x} = \frac{\cos^2 x - 1}{\cos^2 x} = -\frac{(1 + \cos x)(1 - \cos x)}{\cos^2 x},$$

the quotient in (7.24) becomes

$$\frac{f'(x)}{g'(x)} = -\frac{1 + \cos x}{\cos^2 x}$$

and this approaches $-2$ as $x \to 0$. Notice that the indeterminacy disappeared when we
canceled the common factor $1 - \cos x$. Canceling common factors usually tends to
simplify the work in problems of this kind.

When the quotient of derivatives f'(x)/g'(x) also assumes the indeterminate form 0/0, 
we may try L’Hopital’s rule again. In the next example, the indeterminacy is removed 
after two applications of the rule. 


EXAMPLE 3. For any real number $c$, we have

$$\lim_{x \to 1} \frac{x^c - cx + c - 1}{(x - 1)^2} = \lim_{x \to 1} \frac{cx^{c-1} - c}{2(x - 1)} = \lim_{x \to 1} \frac{c(c - 1)x^{c-2}}{2} = \frac{c(c - 1)}{2}.$$


In this sequence of equations it is understood that the existence of each limit implies that 
of the preceding and also their equality. 


The next example shows that L’Hopital’s rule is not infallible. 


EXAMPLE 4. Let $f(x) = e^{-1/x}$ if $x \ne 0$, and let $g(x) = x$. The quotient $f(x)/g(x)$ assumes
the indeterminate form 0/0 as $x \to 0+$, and one application of L’Hopital’s rule leads to
the quotient

$$\frac{f'(x)}{g'(x)} = \frac{(1/x^2)e^{-1/x}}{1} = \frac{e^{-1/x}}{x^2}.$$

This, too, is indeterminate as $x \to 0+$, and if we differentiate numerator and denominator we
obtain $(1/x^2)e^{-1/x}/(2x) = e^{-1/x}/(2x^3)$. After $n$ steps we are led to the quotient $e^{-1/x}/(n!\,x^{n+1})$,
so the indeterminacy never disappears by this method.
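Although repeated differentiation does not resolve the form, numerical evaluation suggests what the limit should be. The sketch below tabulates the quotient reached after $n$ steps; the function name `quotient_after` is introduced here for illustration. The values are consistent with the result established in Example 1 of Section 7.16, namely that $e^{-1/x}/x \to 0$ as $x \to 0+$.

```python
import math

def quotient_after(n, x):
    """e^(-1/x) / (n! x^(n+1)): the quotient after n applications (n = 0 is f/g)."""
    return math.exp(-1.0 / x) / (math.factorial(n) * x ** (n + 1))

for n in (0, 1, 2):
    row = [quotient_after(n, x) for x in (0.1, 0.05, 0.02)]
    print(n, ["%.3e" % v for v in row])
```

Each row decreases rapidly as $x$ shrinks, even though every quotient in the sequence has the form 0/0.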

EXAMPLE 5. When using L’Hopital’s rule repeatedly, some care is needed to make certain
that the quotient under consideration actually assumes an indeterminate form. A common
type of error is illustrated by the following calculation:

$$\lim_{x \to 1} \frac{3x^2 - 2x - 1}{x^2 - x} = \lim_{x \to 1} \frac{6x - 2}{2x - 1} = \lim_{x \to 1} \frac{6}{2} = 3.$$

The first step is correct but the second is not. The quotient $(6x - 2)/(2x - 1)$ is not
indeterminate as $x \to 1$. The correct limit, 4, is obtained by substituting 1 for $x$ in
$(6x - 2)/(2x - 1)$.


EXAMPLE 6. Sometimes the work can be shortened by a change of variable. For example,
we could apply L’Hopital’s rule directly to calculate the limit

$$\lim_{x \to 0+} \frac{\sqrt{x}}{1 - e^{2\sqrt{x}}},$$

but we may avoid differentiation of square roots by writing $t = \sqrt{x}$ and noting that

$$\lim_{x \to 0+} \frac{\sqrt{x}}{1 - e^{2\sqrt{x}}} = \lim_{t \to 0+} \frac{t}{1 - e^{2t}} = \lim_{t \to 0+} \frac{1}{-2e^{2t}} = -\frac12.$$

We turn now to the proof of Theorem 7.9.

Proof. We make use of Cauchy’s mean-value formula (Theorem 4.6 of Section 4.14)
applied to a closed interval having $a$ as its left endpoint. Since the functions $f$ and $g$ may
not be defined at $a$, we introduce two new functions that are defined there. Let

$$F(x) = f(x) \ \text{if } x \ne a, \qquad F(a) = 0,$$

$$G(x) = g(x) \ \text{if } x \ne a, \qquad G(a) = 0.$$

Both $F$ and $G$ are continuous at $a$. In fact, if $a < x < b$, both functions $F$ and $G$ are
continuous on the closed interval $[a, x]$ and have derivatives everywhere in the open interval
$(a, x)$. Therefore Cauchy’s formula is applicable to the interval $[a, x]$ and we obtain

$$[F(x) - F(a)]G'(c) = [G(x) - G(a)]F'(c),$$

where $c$ is some point satisfying $a < c < x$. Since $F(a) = G(a) = 0$, this becomes

$$f(x)g'(c) = g(x)f'(c).$$

Now $g'(c) \ne 0$ [since, by hypothesis, $g'$ is never zero in $(a, b)$] and also $g(x) \ne 0$. In fact,
if we had $g(x) = 0$ then we would have $G(x) = G(a) = 0$ and, by Rolle’s theorem, there
would be a point $x_1$ between $a$ and $x$ where $G'(x_1) = 0$, contradicting the hypothesis that
$g'$ is never zero in $(a, b)$. Therefore we may divide by $g'(c)$ and $g(x)$ to obtain

$$\frac{f(x)}{g(x)} = \frac{f'(c)}{g'(c)}.$$

As $x \to a$, the point $c \to a$ (since $a < c < x$) and the quotient on the right approaches $L$
[by (7.21)]. Hence, $f(x)/g(x)$ also approaches $L$ and the theorem is proved.


7.13 Exercises 

Evaluate the limits in Exercises 1 through 12.

1. $\lim_{x \to 2} \dfrac{3x^2 + 2x - 16}{x^2 - x - 2}$.

2. $\lim_{x \to 3} \dfrac{x^2 - 4x + 3}{2x^2 - 13x + 21}$.

3. $\lim_{x \to 0} \dfrac{\sinh x - \sin x}{x^3}$.

4. $\lim_{x \to 0} \dfrac{(2 - x)e^x - x - 2}{x^3}$.

5. $\lim_{x \to 0} \dfrac{\log(\cos ax)}{\log(\cos bx)}$.

6. $\lim_{x \to 0+} \dfrac{x - \sin x}{(x \sin x)^{3/2}}$.

7. $\lim_{x \to a+} \dfrac{\sqrt{x} - \sqrt{a} + \sqrt{x - a}}{\sqrt{x^2 - a^2}}$.

8. A limit of a quotient with denominator $1 - x + \log x$ [numerator and limit point illegible in source].

9. $\lim_{x \to 0} \dfrac{\arcsin 2x - 2 \arcsin x}{x^3}$.

10. $\lim_{x \to 0} \dfrac{x \cot x - 1}{x^2}$.

11. A limit of a quotient with denominator $x - 1$ [numerator illegible in source].

12. $\lim_{x \to 0+} \dfrac{1}{x\sqrt{x}} \left(a \arctan \dfrac{\sqrt{x}}{a} - b \arctan \dfrac{\sqrt{x}}{b}\right)$.

13. Determine the limit of the quotient

$$\frac{(\sin 4x)(\sin 3x)}{x \sin 2x}$$

as $x \to 0$ and also as $x \to \tfrac12\pi$.

14. For what values of the constants $a$ and $b$ is

$$\lim_{x \to 0} (x^{-3} \sin 3x + ax^{-2} + b) = 0?$$

15. Find constants $a$ and $b$ such that $\lim_{x \to 0} \dfrac{1}{bx - \sin x} \displaystyle\int_0^x \dfrac{t^2\,dt}{\sqrt{a + t}} = 1$.

16. A circular arc of radius 1 subtends an angle of $x$ radians, $0 < x < \tfrac12\pi$, as shown in Figure
7.2. The point $C$ is the intersection of the two tangent lines at $A$ and $B$. Let $T(x)$ be the area of
triangle $ABC$ and let $S(x)$ be the area of the shaded region. Compute the following: (a) $T(x)$;
(b) $S(x)$; (c) the limit of $T(x)/S(x)$ as $x \to 0+$.

Figure 7.2 Exercise 16.

17. The current $I(t)$ flowing in a certain electrical circuit at time $t$ is given by

$$I(t) = \frac{E}{R}\left(1 - e^{-Rt/L}\right),$$

where $E$, $R$, and $L$ are positive numbers. Determine the limiting value of $I(t)$ as $R \to 0+$.

18. A weight hangs by a spring and is caused to vibrate by a sinusoidal force. Its displacement
$f(t)$ at time $t$ is given by an equation of the form

$f(t) = $ [a coefficient with numerator $A$; denominator illegible in source] $\cdot\,(\sin kt - \sin ct)$,

where $A$, $c$, and $k$ are positive constants, with $c \ne k$. Determine the limiting value of the displacement as $c \to k$.


7.14 The symbols $+\infty$ and $-\infty$. Extension of L’Hopital’s rule

L’Hopital’s rule may be extended in several ways. First of all, we may wish to consider
the quotient $f(x)/g(x)$ as $x$ increases without bound. It is convenient to have a short
descriptive symbolism to express the fact that we are allowing $x$ to increase indefinitely.
For this purpose, mathematicians use the special symbol $+\infty$, called “plus infinity.”
Although we shall not attach any meaning to the symbol $+\infty$ by itself, we shall give
precise definitions of various statements involving this symbol.

One of these statements is written as follows:

$$\lim_{x \to +\infty} f(x) = A,$$

and is read “The limit of $f(x)$, as $x$ tends to plus infinity, is $A$.” The idea we are trying to
express here is that the function values $f(x)$ can be made arbitrarily close to the real number
$A$ by taking $x$ large enough. To make this statement mathematically precise, we must
explain what is meant by “arbitrarily close” and by “large enough.” This is done by means
of the following definition:

DEFINITION. The symbolism

$$\lim_{x \to +\infty} f(x) = A$$

means that for every number $\epsilon > 0$, there is another number $M > 0$ (which may depend on $\epsilon$)
such that

$$|f(x) - A| < \epsilon \quad \text{whenever } x > M.$$

Calculations involving limits as $x \to +\infty$ may be reduced to a more familiar case. We
simply replace $x$ by $1/t$ (that is, let $t = 1/x$) and note that $t \to 0$ through positive values as
$x \to +\infty$. More precisely, we introduce a new function $F$ where

$$F(t) = f\left(\frac{1}{t}\right) \quad \text{if } t \ne 0, \tag{7.25}$$

and simply observe that the two statements

$$\lim_{x \to +\infty} f(x) = A \quad \text{and} \quad \lim_{t \to 0+} F(t) = A$$

mean exactly the same thing. The proof of this equivalence requires only the definitions
of the two limit symbols and is left as an exercise.
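The substitution $t = 1/x$ can be illustrated with a concrete function. In the sketch below, the rational function `f` is an arbitrary example chosen here (its limit as $x \to +\infty$ is 2), and `F` is built from it exactly as in (7.25).

```python
def f(x):
    """Sample function with limit 2 as x -> +infinity (chosen for illustration)."""
    return (2 * x + 1) / (x + 3)

def F(t):
    """F(t) = f(1/t) for t != 0, as in (7.25)."""
    return f(1.0 / t)

for x in (1e2, 1e4, 1e6):
    t = 1.0 / x
    print(f"x = {x:.0e}: f(x) = {f(x):.8f}   t = {t:.0e}: F(t) = {F(t):.8f}")
```

Large values of $x$ and small positive values of $t$ give the same function values, which is the content of the equivalence stated above.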

When we are interested in the behavior of $f(x)$ for large negative $x$, we introduce the
symbol $-\infty$ (“minus infinity”) and write

$$\lim_{x \to -\infty} f(x) = A$$

to mean: For every $\epsilon > 0$, there is an $M > 0$ such that

$$|f(x) - A| < \epsilon \quad \text{whenever } x < -M.$$

If $F$ is defined by (7.25), it is easy to verify that the two statements

$$\lim_{x \to -\infty} f(x) = A \quad \text{and} \quad \lim_{t \to 0-} F(t) = A$$

are equivalent.

In view of the above remarks, it is not surprising to find that all the usual rules for
calculating with limits (as stated in Theorem 3.1 of Section 3.4) also apply to limits as
$x \to \pm\infty$. The same is true of L’Hopital’s rule, which may be extended as follows:

THEOREM 7.10. Assume that $f$ and $g$ have derivatives $f'(x)$ and $g'(x)$ for all $x$ greater
than a certain fixed $M > 0$. Suppose that

$$\lim_{x \to +\infty} f(x) = 0 \quad \text{and} \quad \lim_{x \to +\infty} g(x) = 0,$$

and that $g'(x) \ne 0$ for $x > M$. If $f'(x)/g'(x)$ tends to a limit as $x \to +\infty$, then $f(x)/g(x)$
also tends to a limit and the two limits are equal. In other words,

$$\lim_{x \to +\infty} \frac{f'(x)}{g'(x)} = L \quad \text{implies} \quad \lim_{x \to +\infty} \frac{f(x)}{g(x)} = L. \tag{7.26}$$

Proof. Let $F(t) = f(1/t)$ and $G(t) = g(1/t)$. Then $f(x)/g(x) = F(t)/G(t)$ if $t = 1/x$, and
$t \to 0+$ as $x \to +\infty$. Since $F(t)/G(t)$ assumes the indeterminate form 0/0 as $t \to 0+$, we
examine the quotient of derivatives $F'(t)/G'(t)$. By the chain rule, we have

$$F'(t) = -\frac{1}{t^2} f'\left(\frac{1}{t}\right) \quad \text{and} \quad G'(t) = -\frac{1}{t^2} g'\left(\frac{1}{t}\right).$$

Also, $G'(t) \ne 0$ if $0 < t < 1/M$. When $x = 1/t$ and $x > M$, we have $F'(t)/G'(t) = f'(x)/g'(x)$
since the common factor $-1/t^2$ cancels. Therefore, if $f'(x)/g'(x) \to L$ as $x \to +\infty$, then
$F'(t)/G'(t) \to L$ as $t \to 0+$ and hence, by Theorem 7.9, $F(t)/G(t) \to L$. Since $F(t)/G(t) = f(x)/g(x)$ this proves (7.26).

There is, of course, a result analogous to Theorem 7.10 in which we consider limits as
$x \to -\infty$.

7.15 Infinite limits 

In the foregoing section we used the notation $x \to +\infty$ to convey the idea that $x$ takes
on arbitrarily large positive values. We also write

$$\lim_{x \to a} f(x) = +\infty \tag{7.27}$$

or, alternatively,

$$f(x) \to +\infty \quad \text{as } x \to a \tag{7.28}$$

to indicate that $f(x)$ takes arbitrarily large values as $x$ approaches $a$. The precise meaning
of these symbols is given in the following definition.

DEFINITION. The symbolism in (7.27) or in (7.28) means that to every positive number
$M$ (no matter how large), there corresponds another positive number $\delta$ (which may depend on
$M$) such that

$$f(x) > M \quad \text{whenever } 0 < |x - a| < \delta.$$

If $f(x) > M$ whenever $0 < x - a < \delta$, we write

$$\lim_{x \to a+} f(x) = +\infty,$$

and we say that $f(x)$ tends to plus infinity as $x$ approaches $a$ from the right. If $f(x) > M$
whenever $0 < a - x < \delta$, we write

$$\lim_{x \to a-} f(x) = +\infty,$$

and we say that $f(x)$ tends to plus infinity as $x$ approaches $a$ from the left.

The symbols

$$\lim_{x \to a} f(x) = -\infty, \quad \lim_{x \to a+} f(x) = -\infty, \quad \text{and} \quad \lim_{x \to a-} f(x) = -\infty$$

are similarly defined, the only difference being that we replace $f(x) > M$ by $f(x) < -M$.
Examples are shown in Figure 7.3.

Figure 7.3 Infinite limits.

It is also convenient to extend the definitions of these symbols further to cover the cases
when $x \to \pm\infty$. Thus, for example, we write

$$\lim_{x \to +\infty} f(x) = +\infty$$

if, for every positive number $M$, there exists another positive number $X$ such that $f(x) > M$
whenever $x > X$.

The reader should have no difficulty in formulating similar definitions for the symbols

$$\lim_{x \to -\infty} f(x) = +\infty, \quad \lim_{x \to +\infty} f(x) = -\infty, \quad \text{and} \quad \lim_{x \to -\infty} f(x) = -\infty.$$

EXAMPLES. In Chapter 6 we proved that the logarithm function is increasing and unbounded on the positive real axis. We may express this fact briefly by writing

$$\lim_{x \to +\infty} \log x = +\infty. \tag{7.29}$$

We also proved in Chapter 6 that $\log x < 0$ when $0 < x < 1$ and that the logarithm has
no lower bound in the interval $(0, 1)$. Therefore, we may also write $\lim_{x \to 0+} \log x = -\infty$.
From the relation that holds between the logarithm and the exponential function it is
easy to prove that

$$\lim_{x \to +\infty} e^x = +\infty \quad \text{and} \quad \lim_{x \to -\infty} e^x = 0 \quad (\text{or } \lim_{x \to +\infty} e^{-x} = 0). \tag{7.30}$$

Using these results it is not difficult to show that for $a > 0$ we have

$$\lim_{x \to +\infty} x^a = +\infty \quad \text{and} \quad \lim_{x \to +\infty} \frac{1}{x^a} = 0.$$

The idea is to write $x^a = e^{a \log x}$ and use (7.30) together with (7.29). The formulas in (7.30)
also give us the relations

$$\lim_{x \to 0-} e^{-1/x} = +\infty \quad \text{and} \quad \lim_{x \to 0+} e^{-1/x} = 0.$$

The proofs of these statements make good exercises for testing a reader’s understanding
of limit symbols involving $\pm\infty$.


7.16 The behavior of $\log x$ and $e^x$ for large $x$

Infinite limits lead to new types of indeterminate forms. For example, we may have a
quotient $f(x)/g(x)$ where both $f(x) \to +\infty$ and $g(x) \to +\infty$ as $x \to a$ (or as $x \to \pm\infty$).
In this case, we say that the quotient $f(x)/g(x)$ assumes the indeterminate form $\infty/\infty$. There
are various extensions of L’Hopital’s rule that often help to determine the behavior of a
quotient when it assumes the indeterminate form $\infty/\infty$. However, we shall not discuss
these extensions because most examples that occur in practice can be treated by use of the
following theorem which describes the behavior of the logarithm and the exponential for
large values of $x$.

THEOREM 7.11. If $a > 0$ and $b > 0$, we have

$$\lim_{x \to +\infty} \frac{(\log x)^b}{x^a} = 0 \tag{7.31}$$

and

$$\lim_{x \to +\infty} \frac{x^b}{e^{ax}} = 0. \tag{7.32}$$

Proof. We prove (7.31) first and then use it to derive (7.32). A simple proof of (7.31)
may be given directly from the definition of the logarithm as an integral. If $c > 0$ and
$t \ge 1$, we have $t^{-1} \le t^{c-1}$. Hence, if $x > 1$, we may write

$$0 < \log x = \int_1^x \frac{1}{t}\,dt \le \int_1^x t^{c-1}\,dt = \frac{x^c - 1}{c} < \frac{x^c}{c}.$$

Therefore, we have

$$0 < \frac{(\log x)^b}{x^a} < \frac{x^{bc - a}}{c^b} \quad \text{for every } c > 0.$$

If we choose $c = \tfrac12 a/b$, then $x^{bc - a} = x^{-a/2}$ which tends to 0 as $x \to +\infty$. This proves (7.31).
To prove (7.32), we make the change of variable $t = e^x$. Then $x = \log t$, and hence
$x^b/e^{ax} = (\log t)^b/t^a$. But $t \to +\infty$ as $x \to +\infty$, so (7.32) follows from (7.31).

With a natural extension of the o-notation, we can write the limit relations just proved
in the form

$$(\log x)^b = o(x^a) \quad \text{as } x \to +\infty,$$

and

$$x^b = o(e^{ax}) \quad \text{as } x \to +\infty.$$

In other words, no matter how large $b$ may be and no matter how small $a$ may be (as long
as both are positive), $(\log x)^b$ tends to infinity more slowly than $x^a$. Also, $x^b$ tends to
infinity more slowly than $e^{ax}$.
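Theorem 7.11 describes limiting behavior only; for moderate $x$ the quotients can still be large. The following sketch tabulates both quotients for the illustrative exponents $a = 0.5$, $b = 5$ (values chosen here, not taken from the text). The second quotient is evaluated through logarithms to avoid overflow.

```python
import math

def log_vs_power(x, a, b):
    """(log x)^b / x^a for x > 1."""
    return math.log(x) ** b / x ** a

def power_vs_exp(x, a, b):
    """x^b / e^(a x), computed as exp(b log x - a x) to avoid overflow."""
    return math.exp(b * math.log(x) - a * x)

a, b = 0.5, 5.0   # illustrative exponents
for x in (1e2, 1e10, 1e50, 1e100):
    print(f"x = {x:.0e}: (log x)^b / x^a = {log_vs_power(x, a, b):.3e}")
for x in (1e1, 1e2, 1e3):
    print(f"x = {x:.0e}: x^b / e^(ax) = {power_vs_exp(x, a, b):.3e}")
```

The first quotient exceeds 1 at $x = 10^{2}$ and $x = 10^{10}$ before it becomes small, which shows why a few sample values cannot replace the proof.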

EXAMPLE 1. In Example 4 of Section 7.12 we showed that the behavior of $e^{-1/x}/x$ for $x$
near 0 could not be decided by any number of applications of L’Hopital’s rule for 0/0.
However, if we write $t = 1/x$, this quotient becomes $t/e^t$ and it assumes the indeterminate
form $\infty/\infty$ as $t \to +\infty$. Theorem 7.11 tells us that

$$\lim_{t \to +\infty} \frac{t}{e^t} = 0.$$

Therefore, $e^{-1/x}/x \to 0$ as $x \to 0+$ or, in other words, $e^{-1/x} = o(x)$ as $x \to 0+$.

There are other indeterminate forms besides 0/0 and $\infty/\infty$. Some of these, denoted by
the symbols $0 \cdot \infty$, $0^0$, and $\infty^0$, are illustrated by the examples given below. In examples
like these, algebraic manipulation often enables us to reduce the problem to an indeterminate
form of the type 0/0 or $\infty/\infty$ which may be handled by L’Hopital’s rule, by polynomial
approximation, or by Theorem 7.11.

EXAMPLE 2. ($0 \cdot \infty$). Prove that $\lim_{x \to 0+} x^a \log x = 0$ for each fixed $a > 0$.

Solution. Writing $t = 1/x$, we find that $x^a \log x = -(\log t)/t^a$ and, by (7.31), this tends
to 0 as $t \to +\infty$.

EXAMPLE 3. ($0^0$). Show that $\lim_{x \to 0+} x^x = 1$.

Solution. Since $x^x = e^{x \log x}$, by continuity of the exponential function we have

$$\lim_{x \to 0+} x^x = \exp\left(\lim_{x \to 0+} x \log x\right),$$

if the last limit exists. But by Example 2 we know that $x \log x \to 0$ as $x \to 0+$, and hence
$x^x \to e^0 = 1$.

EXAMPLE 4. ($\infty^0$). Show that $\lim_{x \to +\infty} x^{1/x} = 1$.

Solution. Put $t = 1/x$ and use the result of Example 3.

In Section 7.10 we proved the limit relations

$$\lim_{x \to 0} (1 + ax)^{1/x} = e^a \quad \text{and} \quad \lim_{x \to 0} (1 + x)^{a/x} = e^a. \tag{7.33}$$

Each of these is an indeterminate form of the type $1^\infty$. We may replace $x$ by $1/x$ in these
formulas and obtain, respectively,

$$\lim_{x \to +\infty} \left(1 + \frac{a}{x}\right)^x = e^a \quad \text{and} \quad \lim_{x \to +\infty} \left(1 + \frac{1}{x}\right)^{ax} = e^a,$$

both of which are valid for all real $a$.

The relations (7.33) and those in Examples 2, 3, and 4 are all of the type $f(x)^{g(x)}$. These
are usually dealt with by writing

$$f(x)^{g(x)} = e^{g(x) \log f(x)}$$

and then treating the exponent $g(x) \log f(x)$ by one of the methods discussed earlier.
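The device of writing $f(x)^{g(x)}$ as $e^{g(x) \log f(x)}$ translates directly into a computation. The sketch below applies it to Examples 3 and 4; the helper name `power_via_exp` is introduced here for illustration and assumes the base is positive.

```python
import math

def power_via_exp(f_val, g_val):
    """Evaluate f^g as exp(g * log f), valid when f > 0."""
    return math.exp(g_val * math.log(f_val))

for x in (1e-1, 1e-3, 1e-6):
    print(f"x^x at x = {x:.0e}: {power_via_exp(x, x):.8f}")          # type 0^0
for x in (1e1, 1e3, 1e6):
    print(f"x^(1/x) at x = {x:.0e}: {power_via_exp(x, 1 / x):.8f}")  # type infinity^0
```

In both loops the exponent $g(x) \log f(x)$ tends to 0, so the printed values approach 1, in agreement with the two examples.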
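7.17 Exercises

Evaluate the limits in Exercises 1 through 25. The letters $a$ and $b$ denote positive constants.

1. A limit of a quotient with numerator $e^{-1/x^2}$ [denominator and limit point illegible in source].

2. $\lim_{x \to +\infty} \dfrac{\sin(1/x)}{\arctan(1/x)}$.

3. A limit of the quotient $\dfrac{\tan 3x}{\tan x}$ [limit point illegible in source].

4. $\lim_{x \to +\infty} \dfrac{\log(a + be^x)}{\sqrt{a + bx^2}}$.

5. $\lim_{x \to +\infty} x^4 \left(\cos \dfrac{1}{x} - 1 + \dfrac{1}{2x^2}\right)$.

6. A limit of the quotient $\dfrac{\log|\sin x|}{\log|\sin 2x|}$ [limit point illegible in source].

7. $\lim_{x \to \frac12-} \dfrac{\log(1 - 2x)}{\tan \pi x}$.

8. $\lim_{x \to +\infty} \dfrac{\cosh(x + 1)}{e^x}$.

9. $\lim_{x \to +\infty} \dfrac{a^x}{x^b}$, $a > 1$.

10. $\lim_{x \to \frac12\pi} \dfrac{\tan x - 5}{\sec x + 4}$.

11. $\lim_{x \to 0+} \dfrac{1}{\sqrt{x}} \left(\dfrac{1}{\sin x} - \dfrac{1}{x}\right)$.

12. $\lim_{x \to +\infty} x^{1/4} \sin(1/\sqrt{x})$.

13. $\lim_{x \to +\infty} \left(x^2 - \sqrt{x^4 - x^2 + 1}\right)$.

14. [expression illegible in source]

15. $\lim_{x \to 1-} (\log x) \log(1 - x)$.

16. [expression illegible in source]

17. $\lim_{x \to 0+} [x^{(x^x)} - 1]$.

18. $\lim_{x \to 0-} (1 - 2^x)^{\sin x}$.

19. $\lim_{x \to 0+} x^{1/\log x}$.

20. $\lim_{x \to 0+} (\cot x)^{\sin x}$.

21. $\lim_{x \to \frac14\pi} (\tan x)^{\tan 2x}$.

22. $\lim_{x \to 0+} \left(\log \dfrac{1}{x}\right)^x$.

23. $\lim_{x \to 0+} x^{e/(1 + \log x)}$.

24. $\lim_{x \to 1} (2 - x)^{\tan(\pi x/2)}$.

25. $\lim_{x \to 0} \left(\dfrac{1}{\log(x + \sqrt{1 + x^2})} - \dfrac{1}{\log(1 + x)}\right)$.

26. Find $c$ so that a certain limit as $x \to +\infty$ has a prescribed value [expression illegible in source].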




27. Prove that (1 -f x) 1 - 1 + ex + o(x) as x (). u se this to compute the limit of 

{(x 4 + x 2 ) 1/2 — x 2 } as x -> + 00 , 

28. For a certain value of c, the limit 

lim {(x 5 + 7x 4 + 2) c — x} 
x-*+x 


is finite and nonzero. Determine this c and compute the value of the limit. 
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29. Let g(x) = xe* 1 and let f(x) = J* g(t)(t -f \jt) dt. Compute the limit of f"(x)lg"(x) as 
X-+ + CO. 

30. Let g(x) = x c e lx and let f(x) = J* e 2t (3t 2 + l) 1 ' 2 dt. For a certain value of c, the limit of 
f'(x)lg'(x) as *-+ + oo is finite and nonzero. Determine c and compute the value of the limit. 

31. Let f(x) = e~ llx * if x # 0, and let f(0) = 0. 

(a) Prove that for every m > 0 ,f(x)/x m -> 0 as x 0. 

(b) Prove that for x ^0 the nth derivative off has the form / <n) (;c) = f(x)P( \jx), where P(f) 
is a polynomial in t. 

(c) Prove that J (rl) (0) = 0 for all n > L This shows that every Taylor polynomial generated 
by f at 0 is the zero polynomial. 

32. An amount of P dollars is deposited in a bank which pays interest at a rate r per year,
compounded m times a year. (For example, r = 0.06 when the annual rate is 6%.) (a) Prove that
the total amount of principal plus interest at the end of n years is $P(1 + r/m)^{mn}$. If r and n
are kept fixed, this amount approaches the limit $Pe^{rn}$ as $m \to +\infty$. This motivates the
following definition: We say that money grows at an annual rate r when compounded continuously
if the amount $f(t)$ after t years is $f(0)e^{rt}$, where t is any nonnegative real number.
Approximately how long does it take for a bank account to double in value if it receives interest at an
annual rate of 6% compounded (b) continuously? (c) four times a year?
INTRODUCTION TO DIFFERENTIAL EQUATIONS


8.1 Introduction 

A large variety of scientific problems arise in which one tries to determine something 
from its rate of change. For example, we could try to compute the position of a moving 
particle from a knowledge of its velocity or acceleration. Or a radioactive substance may 
be disintegrating at a known rate and we may be required to determine the amount of 
material present after a given time. In examples like these, we are trying to determine an 
unknown function from prescribed information expressed in the form of an equation 
involving at least one of the derivatives of the unknown function. These equations are 
called differential equations, and their study forms one of the most challenging branches 
of mathematics. 

Differential equations are classified under two main headings: ordinary and partial, 
depending on whether the unknown is a function of just one variable or of two or more 
variables. A simple example of an ordinary differential equation is the relation 

(8.1)  $f'(x) = f(x)$

which is satisfied, in particular, by the exponential function, $f(x) = e^x$. We shall see
presently that every solution of (8.1) must be of the form $f(x) = Ce^x$, where C may be any
constant.

On the other hand, an equation like 

$\dfrac{\partial^2 f(x, y)}{\partial x^2} + \dfrac{\partial^2 f(x, y)}{\partial y^2} = 0$

is an example of a partial differential equation. This particular one, called Laplace's
equation, appears in the theory of electricity and magnetism, fluid mechanics, and
elsewhere. It has many different kinds of solutions, among which are $f(x, y) = x + 2y$,
$f(x, y) = e^x \cos y$, and $f(x, y) = \log(x^2 + y^2)$.

The study of differential equations is one part of mathematics that, perhaps more than 
any other, has been directly inspired by mechanics, astronomy, and mathematical physics. 
Its history began in the 17th Century when Newton, Leibniz, and the Bernoullis solved 
some simple differential equations arising from problems in geometry and mechanics. 
These early discoveries, beginning about 1690, gradually led to the development of a now-
classic “bag of tricks” for solving certain special kinds of differential equations. Although
these special tricks are applicable in relatively few cases, they do enable us to solve many 
differential equations that arise in mechanics and geometry, so their study is of practical 
importance. Some of these special methods and some of the problems which they help us 
solve are discussed near the end of this chapter. 

Experience has shown that it is difficult to obtain mathematical theories of much 
generality about solutions of differential equations, except for a few types. Among these 
are the so-called linear differential equations which occur in a great variety of scientific 
problems. The simplest types of linear differential equations and some of their applications 
are also discussed in this introductory chapter. A more thorough study of linear equations 
is carried out in Volume II. 


8.2 Terminology and notation 

When we work with a differential equation such as (8.1), it is customary to write y in
place of $f(x)$ and $y'$ in place of $f'(x)$, the higher derivatives being denoted by $y''$, $y'''$, etc.
Of course, other letters such as u, v, z, etc. are also used instead of y. By the order of an
equation is meant the order of the highest derivative which appears. For example, (8.1)
is a first-order equation which may be written as $y' = y$. The differential equation
$y' = x^3 y + \sin(xy'')$ is one of second order.

In this chapter we shall begin our study with first-order equations which can be solved 
for y’ and written as follows: 

(8.2)  $y' = f(x, y)$,

where the expression $f(x, y)$ on the right has various special forms. A differentiable function
$y = Y(x)$ will be called a solution of (8.2) on an interval I if the function Y and its derivative
$Y'$ satisfy the relation

$Y'(x) = f[x, Y(x)]$

for every x in I. The simplest case occurs when $f(x, y)$ is independent of y. In this case,
(8.2) becomes

(8.3)  $y' = Q(x)$,

say, where Q is assumed to be a given function defined on some interval I. To solve the
differential equation (8.3) means to find a primitive of Q. The second fundamental theorem
of calculus tells us how to do it when Q is continuous on an open interval I. We simply
integrate Q and add any constant. Thus, every solution of (8.3) is included in the formula

(8.4)  $y = \int Q(x)\,dx + C$,

where C is any constant (usually called an arbitrary constant of integration). The differential
equation (8.3) has infinitely many solutions, one for each value of C.

If it is not possible to evaluate the integral in (8.4) in terms of familiar functions, such 
as polynomials, rational functions, trigonometric and inverse trigonometric functions,
logarithms, and exponentials, still we consider the differential equation as having been 
solved if the solution can be expressed in terms of integrals of known functions. In actual 
practice, there are various methods for obtaining approximate evaluations of integrals 
which lead to useful information about the solution. Automatic high-speed computing 
machines are often designed with this kind of problem in mind. 

example. Linear motion determined from the velocity. Suppose a particle moves along a 
straight line in such a way that its velocity at time t is 2 sin t. Determine its position at
time t. 

Solution. If Y(t) denotes the position at time t measured from some starting point, then 
the derivative Y’(t) represents the velocity at time t. We are given that 

$Y'(t) = 2 \sin t$.

Integrating, we find that

$Y(t) = 2 \int \sin t\,dt + C = -2 \cos t + C$.

This is all we can deduce about Y(t) from a knowledge of the velocity alone; some other 
piece of information is needed to fix the position function. We can determine C if we know 
the value of Y at some particular instant. For example, if Y(0) = 0, then C = 2 and the 
position function is Y(t) = 2 — 2 cos t. But if Y(0) = 2, then C = 4 and the position 
function is Y(t) = 4 — 2 cos t. 
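
The role of the initial condition in this example can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names `position` and `position_exact` and the choice of the midpoint rule are assumptions made for illustration. It integrates the velocity 2 sin t starting from a prescribed value Y(0) and compares the result with the closed form C - 2 cos t, where C = Y(0) + 2.

```python
import math

def position(t, y0, steps=10_000):
    """Approximate Y(t) = y0 + integral from 0 to t of 2*sin(u) du (midpoint rule)."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h              # midpoint of the i-th subinterval
        total += 2.0 * math.sin(u) * h
    return y0 + total

def position_exact(t, y0):
    """Closed form Y(t) = C - 2*cos(t), with C = y0 + 2 so that Y(0) = y0."""
    return (y0 + 2.0) - 2.0 * math.cos(t)

# The two initial conditions discussed in the text: Y(0) = 0 and Y(0) = 2.
for y0 in (0.0, 2.0):
    print(y0, position(1.5, y0), position_exact(1.5, y0))
```

The two columns printed for each initial value should agree closely; changing Y(0) shifts the whole position curve by a constant, which is exactly the effect of the arbitrary constant C.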

In some respects the example just solved is typical of what happens in general. Some- 
where in the process of solving a first-order differential equation, an integration is required 
to remove the derivative y’ and in this step an arbitrary constant C appears. The way in 
which the arbitrary constant C enters into the solution will depend on the nature of the 
given differential equation. It may appear as an additive constant, as in Equation (8.4), 
but it is more likely to appear in some other way. For example, when we solve the equation 
$y' = y$ in Section 8.3, we shall find that every solution has the form $y = Ce^x$.

In many problems it is necessary to select from the collection of all solutions one having 
a prescribed value at some point. The prescribed value is called an initial condition, and 
the problem of determining such a solution is called an initial-value problem. This 
terminology originated in mechanics where, as in the above example, the prescribed value 
represents the displacement at some initial time. 

We shall begin our study of differential equations with an important special case. 


8.3 A first-order differential equation for the exponential function 

The exponential function is equal to its own derivative, and the same is true of any 
constant multiple of the exponential. It is easy to show that these are the only functions 
that satisfy this property on the whole real axis. 

theorem 8.1. If C is a given real number, there is one and only one function f which
satisfies the differential equation

$f'(x) = f(x)$

for all real x and which also satisfies the initial condition $f(0) = C$. This function is given
by the formula

$f(x) = Ce^x$.

Proof. It is easy to verify that the function $f(x) = Ce^x$ satisfies both the given differential
equation and the given initial condition. Now we must show that this is the only solution.
Let $y = g(x)$ be any solution of this initial-value problem:

$g'(x) = g(x)$ for all x,  $g(0) = C$.

We wish to show that $g(x) = Ce^x$ or that $g(x)e^{-x} = C$. We consider the function $h(x) =
g(x)e^{-x}$ and show that its derivative is always zero. The derivative of h is given by

$h'(x) = g'(x)e^{-x} - g(x)e^{-x} = e^{-x}[g'(x) - g(x)] = 0$.

Hence, by the zero-derivative theorem, h is constant. But $g(0) = C$ so $h(0) = g(0)e^0 = C$.
Hence, we have $h(x) = C$ for all x, which means that $g(x) = Ce^x$, as required.

Theorem 8.1 is an example of an existence-uniqueness theorem. It tells us that the given 
initial-value problem has a solution (existence) and that it has only one solution (uniqueness). 
The object of much of the research in the theory of differential equations is to discover 
existence and uniqueness theorems for wide classes of equations. 

We discuss next an important type which includes both the differential equation y’ = Q(x) 
and the equation y’ = y as special cases. 


8.4 First-order linear differential equations 

A differential equation of the form 

(8.5)  $y' + P(x)y = Q(x)$,

where P and Q are given functions, is called a first-order linear differential equation. The 
terms involving the unknown function y and its derivative y’ appear as a linear combination 
of y and y’. The functions P and Q are assumed to be continuous on some open interval I. 
We seek all solutions y defined on I.

First we consider the special case in which the right member, Q(x), is identically zero. 
The equation 

(8.6)  $y' + P(x)y = 0$

is called the homogeneous or reduced equation corresponding to (8.5). We will show how 
to solve the homogeneous equation and then use the result to help us solve the non- 
homogeneous equation (8.5). 

If y is nonzero on I, Equation (8.6) is equivalent to the equation

(8.7)  $\dfrac{y'}{y} = -P(x)$.

That is, every nonzero y which satisfies (8.6) also satisfies (8.7) and vice versa. Now suppose
y is a positive function satisfying (8.7). Since the quotient $y'/y$ is the derivative of log y,
Equation (8.7) becomes $D \log y = -P(x)$, from which we find $\log y = -\int P(x)\,dx + C$,
so we have

(8.8)  $y = e^{-A(x)}$, where $A(x) = \int P(x)\,dx - C$.

In other words, if there is a positive solution of (8.6), it must necessarily have the form 

(8.8) for some C. But now it is easy to verify that every function in (8.8) is a solution of 
the homogeneous equation (8.6). In fact, we have 

$y' = -e^{-A(x)}A'(x) = -P(x)e^{-A(x)} = -P(x)y$.

Thus, we have found all positive solutions of (8.6). But now it is easy to describe all 
solutions. We state the result as an existence-uniqueness theorem. 

theorem 8.2. Assume P is continuous on an open interval I. Choose any point a in I
and let b be any real number. Then there is one and only one function $y = f(x)$ which satisfies
the initial-value problem

(8.9)  $y' + P(x)y = 0$, with $f(a) = b$,

on the interval I. This function is given by the formula

(8.10)  $f(x) = be^{-A(x)}$, where $A(x) = \int_a^x P(t)\,dt$.

Proof. Let f be defined by (8.10). Then $A(a) = 0$ so $f(a) = be^0 = b$. Differentiation
shows that f satisfies the differential equation in (8.9), so f is a solution of the initial-value
problem. Now we must show that it is the only solution.

Let g be an arbitrary solution. We wish to show that $g(x) = be^{-A(x)}$ or that $g(x)e^{A(x)} = b$.
Therefore it is natural to introduce $h(x) = g(x)e^{A(x)}$. The derivative of h is given by

(8.11)  $h'(x) = g'(x)e^{A(x)} + g(x)e^{A(x)}A'(x) = e^{A(x)}[g'(x) + P(x)g(x)]$.

Now since g satisfies the differential equation in (8.9), we have $g'(x) + P(x)g(x) = 0$
everywhere on I, so $h'(x) = 0$ for all x in I. This means that h is constant on I. Hence,
we have $h(x) = h(a) = g(a)e^{A(a)} = g(a) = b$. In other words, $g(x)e^{A(x)} = b$, so $g(x) =
be^{-A(x)}$, which shows that $g = f$. This completes the proof.

The last part of the foregoing proof suggests a method for solving the nonhomogeneous
differential equation in (8.5). Suppose that g is any function satisfying (8.5), and let
$h(x) = g(x)e^{A(x)}$ where, as above, $A(x) = \int_a^x P(t)\,dt$. Then Equation (8.11) is again valid,
but since g satisfies (8.5), the formula for $h'(x)$ gives us

$h'(x) = e^{A(x)}Q(x)$.

Now we may invoke the second fundamental theorem to write

$h(x) = h(a) + \int_a^x e^{A(t)}Q(t)\,dt$.

Hence, since $h(a) = g(a)$, every solution g of (8.5) has the form

(8.12)  $g(x) = e^{-A(x)}h(x) = g(a)e^{-A(x)} + e^{-A(x)} \int_a^x Q(t)e^{A(t)}\,dt$.

Conversely, by direct differentiation of (8.12), it is easy to verify that each such g is a
solution of (8.5), so we have found all solutions. We state the result as follows.


theorem 8.3. Assume P and Q are continuous on an open interval I. Choose any point
a in I and let b be any real number. Then there is one and only one function $y = f(x)$ which
satisfies the initial-value problem

$y' + P(x)y = Q(x)$, with $f(a) = b$,

on the interval I. This function is given by the formula

$f(x) = be^{-A(x)} + e^{-A(x)} \int_a^x Q(t)e^{A(t)}\,dt$,

where $A(x) = \int_a^x P(t)\,dt$.
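
The formula of Theorem 8.3 can be evaluated numerically even when the integrals have no convenient closed form. The following is a minimal Python sketch under stated assumptions: the helper names `simpson` and `solve_linear` are hypothetical, and composite Simpson quadrature is just one convenient way to approximate the two integrals; it is not a method taken from the text.

```python
import math

def simpson(func, lo, hi, n=200):
    """Composite Simpson rule for the integral of func from lo to hi (n even)."""
    if lo == hi:
        return 0.0
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def solve_linear(P, Q, a, b, x):
    """Evaluate f(x) = b*e^{-A(x)} + e^{-A(x)} * integral_a^x Q(t) e^{A(t)} dt,
    where A(s) = integral_a^s P(t) dt, as in Theorem 8.3."""
    A = lambda s: simpson(P, a, s)
    integrand = lambda t: Q(t) * math.exp(A(t))
    return math.exp(-A(x)) * (b + simpson(integrand, a, x))

# Sample problem: y' + (1/x - 1) y = e^{2x}/x on (0, +infinity), with f(1) = 0.
P = lambda t: 1.0 / t - 1.0
Q = lambda t: math.exp(2.0 * t) / t
print(solve_linear(P, Q, a=1.0, b=0.0, x=2.0))
```

For this sample problem the exact solution with f(1) = 0 is $f(x) = (e^{2x} - e^{x+1})/x$, so the printed value can be compared with $(e^4 - e^3)/2$.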


Up to now the word “interval” has meant a bounded interval of the form (a, b), [a, b],
[a, b), or (a, b], with a < b. It is convenient to consider also unbounded intervals. They
are denoted by the symbols $(a, +\infty)$, $(-\infty, a)$, $[a, +\infty)$ and $(-\infty, a]$, and they are
defined as follows:

$(a, +\infty) = \{x \mid x > a\}$,  $(-\infty, a) = \{x \mid x < a\}$,

$[a, +\infty) = \{x \mid x \geq a\}$,  $(-\infty, a] = \{x \mid x \leq a\}$.

In addition, it is convenient to refer to the collection of all real numbers as the interval
$(-\infty, +\infty)$. Thus, when we discuss a differential equation or its solution over an interval
I, it will be understood that I is one of the nine types just described.

example. Find all solutions of the first-order differential equation $xy' + (1 - x)y = e^{2x}$
on the interval $(0, +\infty)$.

Solution. First we transform the equation to the form $y' + P(x)y = Q(x)$ by dividing
through by x. This gives us

$y' + \left(\dfrac{1}{x} - 1\right) y = \dfrac{e^{2x}}{x}$,

so $P(x) = 1/x - 1$ and $Q(x) = e^{2x}/x$. Since P and Q are continuous on the interval
$(0, +\infty)$, there is a unique solution $y = f(x)$ satisfying any given initial condition of the
form $f(a) = b$. We shall express all solutions in terms of the initial value at the point a = 1.
In other words, given any real number b, we will determine all solutions for which $f(1) = b$.
First we compute

$A(x) = \int_1^x P(t)\,dt = \int_1^x \left(\dfrac{1}{t} - 1\right) dt = \log x - (x - 1)$.

Hence we have $e^{-A(x)} = e^{x - 1 - \log x} = e^{x-1}/x$, and $e^{A(t)} = te^{1-t}$, so Theorem 8.3 tells us
that the solution is given by the formula

$f(x) = b\,\dfrac{e^{x-1}}{x} + \dfrac{e^{x-1}}{x} \int_1^x \dfrac{e^{2t}}{t}\, te^{1-t}\,dt = b\,\dfrac{e^{x-1}}{x} + \dfrac{e^x}{x} \int_1^x e^t\,dt$

$= b\,\dfrac{e^{x-1}}{x} + \dfrac{e^x}{x}(e^x - e) = b\,\dfrac{e^{x-1}}{x} + \dfrac{e^{2x}}{x} - \dfrac{e^{x+1}}{x}$.

We can also write this in the form

$f(x) = \dfrac{e^{2x} + Ce^x}{x}$,

where $C = be^{-1} - e$. This gives all solutions on the interval $(0, +\infty)$.

It may be of interest to study the behavior of the solutions as $x \to 0$. If we approximate
the exponential by its linear Taylor polynomial, we find that $e^{2x} = 1 + 2x + o(x)$ and
$e^x = 1 + x + o(x)$ as $x \to 0$, so we have

$f(x) = \dfrac{(1 + C) + (2 + C)x + o(x)}{x} = \dfrac{1 + C}{x} + (2 + C) + o(1)$.

Therefore, only the solution with C = -1 tends to a finite limit as $x \to 0$, this limit being 1.
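
Both conclusions of this example can be spot-checked numerically. The Python sketch below is an illustration only: the names `f` and `residual` and the central-difference step are assumptions, not part of the text. It substitutes the family $f(x) = (e^{2x} + Ce^x)/x$ into $xy' + (1 - x)y = e^{2x}$ and evaluates the solutions near x = 0.

```python
import math

def f(x, C):
    """General solution f(x) = (e^{2x} + C e^x) / x on (0, +infinity)."""
    return (math.exp(2 * x) + C * math.exp(x)) / x

def residual(x, C, h=1e-5):
    """x*f'(x) + (1 - x)*f(x) - e^{2x}, with f' estimated by a central difference."""
    df = (f(x + h, C) - f(x - h, C)) / (2 * h)
    return x * df + (1 - x) * f(x, C) - math.exp(2 * x)

print(residual(1.0, 3.0))   # close to zero for any choice of C
print(f(1e-4, -1.0))        # close to 1: the one solution with a finite limit
print(f(1e-4, 0.0))         # large: behaves like (1 + C)/x near 0
```

The residual is not exactly zero because the derivative is only approximated, but it should be small compared with the size of the individual terms.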


8.5 Exercises 

In each of Exercises 1 through 5, solve the initial-value problem on the specified interval. 

1. $y' - 3y = e^{2x}$ on $(-\infty, +\infty)$, with y = 0 when x = 0.

2. $xy' - 2y = x^5$ on $(0, +\infty)$, with y = 1 when x = 1.

3. $y' + y \tan x = \sin 2x$ on $(-\frac{1}{2}\pi, \frac{1}{2}\pi)$, with y = 2 when x = 0.

4. $y' + xy = x^3$ on $(-\infty, +\infty)$, with y = 0 when x = 0.

5. $\dfrac{dx}{dt} + x = e^{2t}$ on $(-\infty, +\infty)$, with x = 1 when t = 0.

6. Find all solutions of $y' \sin x + y \cos x = 1$ on the interval $(0, \pi)$. Prove that exactly one of
these solutions has a finite limit as $x \to 0$, and another has a finite limit as $x \to \pi$.

7. Find all solutions of $x(x + 1)y' + y = x(x + 1)^2 e^{-x^2}$ on the interval (-1, 0). Prove that all
solutions approach 0 as $x \to -1$, but that only one of them has a finite limit as $x \to 0$.

8. Find all solutions of $y' + y \cot x = 2 \cos x$ on the interval $(0, \pi)$. Prove that exactly one of
these is also a solution on $(-\infty, +\infty)$.
9. Find all solutions of $(x - 2)(x - 3)y' + 2y = (x - 1)(x - 2)$ on each of the following
intervals: (a) $(-\infty, 2)$; (b) (2, 3); (c) $(3, +\infty)$. Prove that all solutions tend to a finite limit
as $x \to 2$, but that none has a finite limit as $x \to 3$.

10. Let $s(x) = (\sin x)/x$ if $x \neq 0$, and let $s(0) = 1$. Define $T(x) = \int_0^x s(t)\,dt$. Prove that the
function $f(x) = xT(x)$ satisfies the differential equation $xy' - y = x \sin x$ on the interval
$(-\infty, +\infty)$ and find all solutions on this interval. Prove that the differential equation has
no solution satisfying the initial condition $f(0) = 1$, and explain why this does not contradict
Theorem 8.3.

11. Prove that there is exactly one function f, continuous on the positive real axis, such that

$f(x) = 1 + \dfrac{1}{x} \int_1^x f(t)\,dt$

for all x > 0 and find this function.

12. The function f defined by the equation

$f(x) = xe^{(1 - x^2)/2} - xe^{-x^2/2} \int_1^x t^{-2} e^{t^2/2}\,dt$

for x > 0 has the properties that (i) it is continuous on the positive real axis, and (ii) it satisfies
the equation

$f(x) = 1 - x \int_1^x f(t)\,dt$

for all x > 0. Find all functions with these two properties.

The Bernoulli equation. A differential equation of the form $y' + P(x)y = Q(x)y^n$, where n is
not 0 or 1, is called a Bernoulli equation. This equation is nonlinear because of the presence of $y^n$.
The next exercise shows that it can always be transformed into a linear first-order equation for a
new unknown function v, where $v = y^k$, $k = 1 - n$.

13. Let k be a nonzero constant. Assume P and Q are continuous on an interval I. If $a \in I$ and
if b is any real number, let $v = g(x)$ be the unique solution of the initial-value problem
$v' + kP(x)v = kQ(x)$ on I, with $g(a) = b$. If $n \neq 1$ and $k = 1 - n$, prove that a function
$y = f(x)$, which is never zero on I, is a solution of the initial-value problem

$y' + P(x)y = Q(x)y^n$ on I, with $f(a)^k = b$

if and only if the kth power of f is equal to g on I.

In each of Exercises 14 through 17, solve the initial-value problem on the specified interval. 

14. $y' - 4y = 2e^x y^{1/2}$ on $(-\infty, +\infty)$, with y = 2 when x = 0.

15. $y' - y = -y^2(x^2 + x + 1)$ on $(-\infty, +\infty)$, with y = 1 when x = 0.

16. $xy' - 2y = 4x^3 y^{1/2}$ on $(-\infty, +\infty)$, with y = 0 when x = 1.

17. $xy' + y = y^2 x^2 \log x$ on $(0, +\infty)$, with $y = \frac{1}{2}$ when x = 1.

18. $2xyy' + (1 + x)y^2 = e^x$ on $(0, +\infty)$, with (a) $y = \sqrt{e}$ when x = 1; (b) $y = -\sqrt{e}$ when x = 1;
(c) a finite limit as $x \to 0$.

19. An equation of the form $y' + P(x)y + Q(x)y^2 = R(x)$ is called a Riccati equation. (There
is no known method for solving the general Riccati equation.) Prove that if u is a known
solution of this equation, then there are further solutions of the form $y = u + 1/v$, where v
satisfies a first-order linear equation.

20. The Riccati equation $y' + y + y^2 = 2$ has two constant solutions. Start with each of these
and use Exercise 19 to find further solutions as follows: (a) If -2 < b < 1, find a solution on
$(-\infty, +\infty)$ for which y = b when x = 0. (b) If b > 1 or b < -2, find a solution on the interval
$(-\infty, +\infty)$ for which y = b when x = 0.

8.6 Some physical problems leading to first-order linear differential equations 

In this section we will discuss various physical problems that can be formulated mathe- 
matically as differential equations. In each case, the differential equation represents an 
idealized simplification of the physical problem and is called a mathematical model of 
the problem. The differential equation occurs as a translation of some physical law, such 
as Newton’s second law of motion, a “conservation” law, etc. Our purpose here is not to 
justify the choice of the mathematical model but rather to deduce logical consequences
from it. Each model is only an approximation to reality, and its justification properly 
belongs to the science from which the problem emanates. If intuition or experimental 
evidence agrees with the results deduced mathematically, then we feel that the model is a 
useful one. If not, we try to find a more suitable model. 

EXAMPLE 1. Radioactive decay. Although various radioactive elements show marked
differences in their rates of decay, they all seem to share a common property: the rate at
which a given substance decomposes at any instant is proportional to the amount present
at that instant. If we denote by $y = f(t)$ the amount present at time t, the derivative $y' =
f'(t)$ represents the rate of change of y at time t, and the “law of decay” states that

$y' = -ky$,

where k is a positive constant (called the decay constant) whose actual value depends on
the particular element that is decomposing. The minus sign comes in because y decreases
as t increases, and hence $y'$ is always negative. The differential equation $y' = -ky$ is the
mathematical model used for problems concerning radioactive decay. Every solution
$y = f(t)$ of this differential equation has the form

(8.13)  $f(t) = f(0)e^{-kt}$.

Therefore, to determine the amount present at time t, we need to know the initial amount
$f(0)$ and the value of the decay constant k.

It is interesting to see what information can be deduced from (8.13), without knowing the
exact value of $f(0)$ or of k. First we observe that there is no finite time t at which $f(t)$ will
be zero because the exponential $e^{-kt}$ never vanishes. Therefore, it is not useful to study
the “total lifetime” of a radioactive substance. However, it is possible to determine the
time required for any particular fraction of a sample to decay. The fraction $\frac{1}{2}$ is usually
chosen for convenience and the time T at which $f(T)/f(0) = \frac{1}{2}$ is called the half-life of the
substance. This can be determined by solving the equation $e^{-kT} = \frac{1}{2}$ for T. Taking
logarithms, we get $-kT = -\log 2$ or $T = (\log 2)/k$. This equation relates the half-life
to the decay constant. Since we have

$\dfrac{f(t + T)}{f(t)} = \dfrac{f(0)e^{-k(t+T)}}{f(0)e^{-kt}} = e^{-kT} = \dfrac{1}{2}$,

we see that the half-life is the same for every sample of a given material. Figure 8.1 illustrates
the general shape of a radioactive decay curve.
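
The relation $T = (\log 2)/k$ and the fact that the half-life does not depend on the starting time or on the size of the sample can be illustrated with a short Python sketch. The function names `half_life` and `amount` and the sample numbers are hypothetical, chosen only for illustration.

```python
import math

def half_life(k):
    """Half-life T = (log 2)/k for decay constant k > 0."""
    return math.log(2) / k

def amount(t, initial, k):
    """Amount present at time t according to (8.13): f(t) = f(0) e^{-kt}."""
    return initial * math.exp(-k * t)

k = 0.1                      # sample decay constant (an assumed value)
T = half_life(k)
# The ratio f(t + T)/f(t) is 1/2 for every starting time t and every initial amount.
for t in (0.0, 3.0, 10.0):
    print(t, amount(t + T, 8.0, k) / amount(t, 8.0, k))
```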

EXAMPLE 2. Falling body in a resisting medium. A body of mass m is dropped from 
rest from a great height in the earth’s atmosphere. Assume that it falls in a straight line 
and that the only forces acting on it are the earth’s gravitational attraction (mg, where g is 
the acceleration due to gravity, assumed to be constant) and a resisting force (due to air 
resistance) which is proportional to its velocity. It is required to discuss the resulting 
motion. 

Let $s = f(t)$ denote the distance the body has fallen at time t and let $v = s' = f'(t)$ denote
its velocity. The assumption that it falls from rest means that $f'(0) = 0$.

There are two forces acting on the body, a downward force mg (due to its weight) and 
an upward force $-kv$ (due to air resistance), where k is some positive constant. Newton’s
second law states that the net sum of the forces acting on the body at any instant is equal 
to the product of its mass m and its acceleration. If we denote the acceleration at time t 
by a, then a = v' = s" and Newton’s law gives us the equation 

$ma = mg - kv$.

This can be considered as a second-order differential equation for the displacement s or
as a first-order equation for the velocity v. As a first-order equation for v, it is linear and
can be written in the form

$v' + \dfrac{k}{m} v = g$.

This equation is the mathematical model of the problem. Since v = 0 when t = 0, the
unique solution of the differential equation is given by the formula

(8.14)  $v = e^{-kt/m} \int_0^t g e^{ku/m}\,du = \dfrac{mg}{k}(1 - e^{-kt/m})$.

Note that $v \to mg/k$ as $t \to +\infty$. If we differentiate Equation (8.14) we find that the
acceleration at every instant is $a = ge^{-kt/m}$. Note that $a \to 0$ as $t \to +\infty$. Interpreted
physically, this means that the air resistance tends to balance out the force of gravity.

Since $v = s'$, Equation (8.14) is itself a differential equation for the displacement s, and
it may be integrated directly to give

$s = \dfrac{mg}{k}\,t + \dfrac{gm^2}{k^2}\,e^{-kt/m} + C$.

Since s = 0 when t = 0, we find that $C = -gm^2/k^2$ and the equation of motion becomes

$s = \dfrac{mg}{k}\,t + \dfrac{gm^2}{k^2}(e^{-kt/m} - 1)$.

If the initial velocity is $v_0$ when t = 0, formula (8.14) for the velocity at time t must be
replaced by

$v = \dfrac{mg}{k}(1 - e^{-kt/m}) + v_0 e^{-kt/m}$.

It is interesting to note that for every initial velocity (positive, negative, or zero), the limiting
velocity, as t increases without bound, is $mg/k$, a number independent of $v_0$. The reader
should convince himself, on physical grounds, that this seems reasonable.
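
The formulas of this example are easy to tabulate. The following Python sketch is illustrative only: the function names, the default value g = 9.8 (meters per second squared), and the sample values of m and k are assumptions, not data from the text. It evaluates the velocity for several initial velocities and shows each one approaching the same limiting value $mg/k$.

```python
import math

def velocity(t, m, k, g=9.8, v0=0.0):
    """v(t) = (mg/k)(1 - e^{-kt/m}) + v0 e^{-kt/m}; reduces to (8.14) when v0 = 0."""
    decay = math.exp(-k * t / m)
    return (m * g / k) * (1.0 - decay) + v0 * decay

def distance(t, m, k, g=9.8):
    """Distance fallen from rest: s(t) = (mg/k) t + (g m^2/k^2)(e^{-kt/m} - 1)."""
    return (m * g / k) * t + (g * m**2 / k**2) * (math.exp(-k * t / m) - 1.0)

m, k = 2.0, 0.5                      # assumed sample mass and resistance constant
for v0 in (-30.0, 0.0, 100.0):
    print(v0, velocity(60.0, m, k, v0=v0), m * 9.8 / k)
```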

example 3. A cooling problem. The rate at which a body changes temperature is pro- 
portional to the difference between its temperature and that of the surrounding medium. 
(This is called Newton's law of cooling.) If $y = f(t)$ is the (unknown) temperature of the
body at time t and if $M(t)$ denotes the (known) temperature of the surrounding medium,
Newton’s law leads to the differential equation

(8.15)  $y' = -k[y - M(t)]$ or $y' + ky = kM(t)$,

where k is a positive constant. This first-order linear equation is the mathematical model
we use for cooling problems. The unique solution of the equation satisfying the initial
condition $f(a) = b$ is given by the formula

(8.16)  $f(t) = be^{-k(t-a)} + e^{-kt} \int_a^t kM(u)e^{ku}\,du$.

Consider now a specific problem in which a body cools from 200° to 100° in 40 minutes
while immersed in a medium whose temperature is kept constant, say $M(t) = 10°$. If we
measure t in minutes and $f(t)$ in degrees, we have $f(0) = 200$ and Equation (8.16), with a = 0, gives us

(8.17)  $f(t) = 200e^{-kt} + 10ke^{-kt} \int_0^t e^{ku}\,du$

$= 200e^{-kt} + 10(1 - e^{-kt}) = 10 + 190e^{-kt}$.

We can compute k from the information that $f(40) = 100$. Putting t = 40 in (8.17), we
find $90 = 190e^{-40k}$, so $-40k = \log(90/190)$, $k = \frac{1}{40}(\log 19 - \log 9)$.

Next, let us compute the time required for this same material to cool from 200° to
100° if the temperature of the medium is kept at 5°. Then Equation (8.16) is valid with the
same constant k but with $M(u) = 5$. Instead of (8.17), we get the formula

$f(t) = 5 + 195e^{-kt}$.

To find the time t for which $f(t) = 100$, we get $95 = 195e^{-kt}$, so $-kt = \log(95/195) =
\log(19/39)$, and hence

$t = \dfrac{1}{k}(\log 39 - \log 19) = 40\,\dfrac{\log 39 - \log 19}{\log 19 - \log 9}$.

From a four-place table of natural logarithms, we find log 39 = 3.6636, log 19 = 2.9444,
and log 9 = 2.1972 so, with slide-rule accuracy, we get t = 40(0.719)/(0.747) = 38.5
minutes.

The differential equation in (8.15) tells us that the rate of cooling decreases considerably
as the temperature of the body begins to approach the temperature of the medium. To
illustrate, let us find the time required to cool the same substance from 100° to 10° with
the medium kept at 5°. The calculation leads to $\log(5/95) = -kt$, or

$t = \dfrac{1}{k} \log 19 = 40\,\dfrac{\log 19}{\log 19 - \log 9} = \dfrac{40(2.944)}{0.747} = 158$ minutes.

Note that the temperature drop from 100° to 10° takes more than four times as long as the
change from 200° to 100°.
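
The arithmetic in this example, done in the text with a table of logarithms, can be reproduced directly. In the Python sketch below the function names `decay_constant` and `cooling_time` are hypothetical; both come from solving $f(t) = M + (f(0) - M)e^{-kt}$ for k or for t when the medium temperature M is constant.

```python
import math

def decay_constant(start, end, medium, minutes):
    """Solve end = medium + (start - medium) e^{-k * minutes} for k."""
    return math.log((start - medium) / (end - medium)) / minutes

def cooling_time(start, end, medium, k):
    """Solve end = medium + (start - medium) e^{-k t} for t."""
    return math.log((start - medium) / (end - medium)) / k

k = decay_constant(200, 100, 10, 40)     # k = (log 19 - log 9)/40
t1 = cooling_time(200, 100, 5, k)        # text: about 38.5 minutes
t2 = cooling_time(100, 10, 5, k)         # text: about 158 minutes
print(k, t1, t2, t2 / t1)
```

The last printed ratio makes the closing remark quantitative: the second drop takes a little more than four times as long as the first.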

example 4. A dilution problem. A tank contains 100 gallons of brine whose concentration
is 2.5 pounds of salt per gallon. Brine containing 2 pounds of salt per gallon runs into the
tank at a rate of 5 gallons per minute and the mixture (kept uniform by stirring) runs out
at the same rate. Find the amount of salt in the tank at every instant.

Let $y = f(t)$ denote the number of pounds of salt in the tank at time t minutes after
mixing begins. There are two factors which cause y to change, the incoming brine which
brings salt in at a rate of 10 pounds per minute and the outgoing mixture which removes salt
at a rate of $5(y/100)$ pounds per minute. (The fraction $y/100$ represents the concentration
at time t.) Hence the differential equation is

$y' = 10 - \frac{1}{20}y$ or $y' + \frac{1}{20}y = 10$.

This linear equation is the mathematical model for our problem. Since y = 250 when
t = 0, the unique solution is given by the formula

(8.18)  $y = 250e^{-t/20} + e^{-t/20} \int_0^t 10e^{u/20}\,du = 200 + 50e^{-t/20}$.

This equation shows that y > 200 for all t and that $y \to 200$ as t increases without bound.
Hence, the minimum salt content is 200 pounds. (This could also have been guessed from
the statement of the problem.) Equation (8.18) can be solved for t in terms of y to yield

$t = 20 \log\left(\dfrac{50}{y - 200}\right)$.

This enables us to find the time at which the salt content will be a given amount y, provided
that 200 < y < 250.
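
Formula (8.18) and its inverse can be checked against each other with a few lines of Python. The function names `salt` and `time_for` are hypothetical; the range check in `time_for` also admits y = 250, which corresponds to t = 0.

```python
import math

def salt(t):
    """Pounds of salt at time t (minutes), from (8.18): y = 200 + 50 e^{-t/20}."""
    return 200.0 + 50.0 * math.exp(-t / 20.0)

def time_for(y):
    """Time at which the salt content equals y: t = 20 log(50/(y - 200))."""
    if not 200.0 < y <= 250.0:
        raise ValueError("salt content must satisfy 200 < y <= 250")
    return 20.0 * math.log(50.0 / (y - 200.0))

print(salt(0.0))          # 250.0, the initial amount
print(time_for(225.0))    # 20 log 2: time for the excess over 200 pounds to halve
print(time_for(salt(13.0)))
```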

example 5. Electric circuits. Figure 8.2(a), page 318, shows an electric circuit which 
has an electromotive force, a resistor, and an inductor connected in series. The electro- 
motive force produces a voltage which causes an electric current to flow in the circuit. 
If the reader is not familiar with electric circuits, he should not be concerned. For our 
purposes, all we need to know about the circuit is that the voltage, denoted by V(t), 
and the current, denoted by $I(t)$, are functions of time t related by a differential equation
of the form

(8.19)  $LI'(t) + RI(t) = V(t)$.

Here L and R are assumed to be positive constants. They are called, respectively, the
inductance and resistance of the circuit. The differential equation is a mathematical
formulation of a conservation law known as Kirchhoff’s voltage law, and it serves as a
mathematical model for the circuit.

Those readers unfamiliar with circuits may find it helpful to think of the current as being 
analogous to water flowing in a pipe. The electromotive force (usually a battery or a 
generator) is analogous to a pump which causes the water to flow; the resistor is analogous 
to friction in the pipe, which tends to oppose the flow; and the inductance is a stabilizing 
influence which tends to oppose sudden changes in the current due to sudden changes in 
the voltage. 

The usual type of question concerning such circuits is this: If a given voltage V(t ) is 
impressed on the circuit, what is the resulting current /(?)? Since we are dealing with a 
first-order linear differential equation, the solution is a routine matter. If Z(0) denotes the 
initial current at time t = 0, the equation has the solution 

Z(t) = /( 0)e~ nt/L + e~ RtlL f * e RxlL dx ' 

Jo L 

An important special case occurs when the impressed voltage is constant, say V(t ) = E 
for all t. In this case, the integration is easy to perform and we are led to the formula 

Kf =|+ (/(o) - 1 Y RttL . 
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Inductor 



(a) 



Figure 8.2 (a) Diagram for a simple series circuit, (b) The current resulting from 

a constant impressed voltage E. 


This shows that the nature of the solution depends on the relation between the initial 
current I(0) and the quotient E/R. If I(0) = E/R, the exponential term is not present and 
the current is constant, I(t) = E/R. If I(0) > E/R, the coefficient of the exponential term 
is positive and the current decreases to the limiting value E/R as t → +∞. If I(0) < E/R, 
the current increases to the limiting value E/R. The constant E/R is called the steady-state 
current, and the exponential term $[I(0) - E/R] e^{-Rt/L}$ is called the transient current. 
Examples are illustrated in Figure 8.2(b). 
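
The three cases just described can be checked numerically. The following Python sketch is not part of the original text: the function names `current` and `residual` and the sample values E = 12, R = 4, L = 2 (so that E/R = 3) are assumptions chosen only for illustration. It evaluates the formula for the current under a constant impressed voltage and uses a central difference to confirm that LI'(t) + RI(t) - E is close to zero.

```python
import math

def current(t, E, R, L, I0):
    """Current I(t) in a series R-L circuit with constant voltage E and I(0) = I0."""
    steady = E / R                                     # steady-state current E/R
    transient = (I0 - steady) * math.exp(-R * t / L)   # decaying transient term
    return steady + transient

def residual(t, E, R, L, I0, h=1e-6):
    """L*I'(t) + R*I(t) - E, with I'(t) approximated by a central difference."""
    dI = (current(t + h, E, R, L, I0) - current(t - h, E, R, L, I0)) / (2 * h)
    return L * dI + R * current(t, E, R, L, I0) - E

# Illustrative values only: initial currents below, equal to, and above E/R = 3.
for I0 in (0.0, 3.0, 5.0):
    samples = [round(current(t, 12.0, 4.0, 2.0, I0), 4) for t in (0.0, 1.0, 5.0)]
    print("I(0) =", I0, "->", samples)
```

Starting below E/R the sampled values should rise toward 3, starting above they should fall toward 3, and starting at E/R they should stay constant.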

The foregoing examples illustrate the unifying power and practical utility of differential 
equations. They show how several different types of physical problems may lead to 
exactly the same type of differential equation. 

The differential equation in (8.19) is of special interest because it suggests the possibility 
of attacking a wide variety of physical problems by electrical means. For example, suppose 
a physical problem leads to a differential equation of the form 


y' + ay = Q , 


where a is a positive constant and Q is a known function. We can try to construct an 
electric circuit with inductance L and resistance R in the ratio R/L = a and then try to 
impress a voltage LQ on the circuit. We would then have an electric circuit with exactly the 
same mathematical model as the physical problem. Thus, we can hope to get numerical 
data about the solution of the physical problem by making measurements of current in 
the electric circuit. This idea has been used in practice and has led to the development of 
the analog computer. 


8.7 Exercises 

In the following exercises, use an appropriate first-order differential equation as a mathematical 
model of the problem. 

1. The half-life for radium is approximately 1600 years. Find what percentage of a given quantity 
of radium disintegrates in 100 years. 

2. If a strain of bacteria grows at a rate proportional to the amount present and if the population 
doubles in one hour, by how much will it increase at the end of two hours? 

3. Denote by y = f(t) the amount of a substance present at time t. Assume it disintegrates at a 
rate proportional to the amount present. If n is a positive integer, the number T for which 
f(T) = f(0)/n is called the 1/nth life of the substance. 

(a) Prove that the 1/nth life is the same for every sample of a given material, and compute T 
in terms of n and the decay constant k. 

(b) If a and b are given, prove that f can be expressed in the form 

$f(t) = f(a)^{w(t)} f(b)^{1 - w(t)}$ 

and determine w(t). This shows that the amount present at time t is a weighted geometric 
mean of the amounts present at two instants t = a and t = b. 

4. A man wearing a parachute jumps from a great height. The combined weight of man and 
parachute is 192 pounds. Let v(t) denote his speed (in feet per second) at time t seconds after 
falling. During the first 10 seconds, before the parachute opens, assume the air resistance is 
(3/4)v(t) pounds. Thereafter, while the parachute is open, assume the resistance is 12v(t) pounds. 
Assume the acceleration of gravity is 32 ft/sec² and find explicit formulas for the speed v(t) 
at time t. (You may use the approximation $e^{-5/4} = 37/128$ in your calculations.) 

5. Refer to Example 2 of Section 8.6. Use the chain rule to write 

$\frac{dv}{dt} = \frac{ds}{dt} \frac{dv}{ds} = v \frac{dv}{ds}$ 

and thus show that the differential equation in the example can be expressed as follows: 

$\frac{ds}{dv} = \frac{bv}{c - v}$ , 

where b = m/k and c = gm/k. Integrate this equation to express s in terms of v. Check your 
result with the formulas for v and s derived in the example. 

6. Modify Example 2 of Section 8.6 by assuming the air resistance is proportional to v². Show 
that the differential equation can be put in each of the following forms: 

$\frac{ds}{dv} = \frac{m}{k} \frac{v}{c^2 - v^2}$ ;    $\frac{dt}{dv} = \frac{m}{k} \frac{1}{c^2 - v^2}$ , 

where $c = \sqrt{mg/k}$. Integrate each of these and obtain the following formulas for v: 

$v^2 = \frac{mg}{k} \left( 1 - e^{-2ks/m} \right)$ ;    $v = c \, \frac{e^{bt} - e^{-bt}}{e^{bt} + e^{-bt}} = c \tanh bt$ , 

where $b = \sqrt{kg/m}$. Determine the limiting value of v as t → +∞. 

7. A body in a room at 60° cools from 200° to 120° in half an hour. 

(a) Show that its temperature after t minutes is $60 + 140 e^{-kt}$, where k = (log 7 - log 3)/30. 

(b) Show that the time t required to reach a temperature of T degrees is given by the formula 
t = [log 140 - log (T - 60)]/k, where 60 < T < 200. 

(c) Find the time at which the temperature is 90°. 

(d) Find a formula for the temperature of the body at time t if the room temperature is not 
kept constant but falls at a rate of 1° each ten minutes. Assume the room temperature is 60° 
when the body temperature is 200°. 

8. A thermometer has been stored in a room whose temperature is 75°. Five minutes after being 
taken outdoors it reads 65°. After another five minutes, it reads 60°. Compute the outdoor 
temperature. 

9. In a tank are 100 gallons of brine containing 50 pounds of dissolved salt. Water runs into the 
tank at the rate of 3 gallons per minute, and the concentration is kept uniform by stirring. 
How much salt is in the tank at the end of one hour if the mixture runs out at a rate of 2 gallons 
per minute? 

10. Refer to Exercise 9. Suppose the bottom of the tank is covered with a mixture of salt and in- 
soluble material. Assume that the salt dissolves at a rate proportional to the difference between 
the concentration of the solution and that of a saturated solution (3 pounds of salt per gallon), 
and that if the water were fresh 1 pound of salt would dissolve per minute. How much salt 
will be in solution at the end of one hour? 

11. Consider an electric circuit like that in Example 5 of Section 8.6. Assume the electromotive 
force is an alternating current generator which produces a voltage V(t) = E sin ωt, where E 
and ω are positive constants (ω is the Greek letter omega). If I(0) = 0, prove that the current 
has the form 

$I(t) = \frac{E}{\sqrt{R^2 + \omega^2 L^2}} \sin(\omega t - \alpha) + \frac{E \omega L}{R^2 + \omega^2 L^2} e^{-Rt/L}$ , 

where α depends only on ω, L, and R. Show that α = 0 when L = 0. 

12. Refer to Example 5 of Section 8.6. Assume the impressed voltage is a step function defined as 
follows: E(t) = E if a < t < b, where a > 0; E(t) = 0 for all other t. If I(0) = 0 prove that 
the current is given by the following formulas: I(t) = 0 if t < a; 

$I(t) = \frac{E}{R} \left( 1 - e^{-R(t-a)/L} \right)$ if a < t < b ;    $I(t) = \frac{E}{R} e^{-Rt/L} \left( e^{Rb/L} - e^{Ra/L} \right)$ if t > b . 

Make a sketch indicating the nature of the graph of I. 

Population growth. In a study of the growth of a population (whether human, animal, or bac- 
terial), the function which counts the number x of individuals present at time t is necessarily a step 
function taking on only integer values. Therefore the true rate of growth dx/dt is zero (when t lies 
in an open interval where x is constant), or else the derivative dx/dt does not exist (when x jumps 
from one integer to another). Nevertheless, useful information can often be obtained if we assume 
that the population x is a continuous function of t with a continuous derivative dx/dt at each 
instant. We then postulate various “laws of growth” for the population, depending on the factors 
in the environment which may stimulate or hinder growth. 

For example, if environment has little or no effect, it seems reasonable to assume that the rate 
of growth is proportional to the amount present. The simplest kind of growth law takes the form 

(8.20) $\frac{dx}{dt} = kx$ , 

where k is a constant that depends on the particular kind of population. Conditions may develop 
which cause the factor k to change with time, and the growth law (8.20) can be generalized as 
follows: 

(8.21) $\frac{dx}{dt} = k(t) x$ . 

If, for some reason, the population cannot exceed a certain maximum M (for example, because 
the food supply may be exhausted), we may reasonably suppose that the rate of growth is jointly 
proportional to both x and M - x. Thus we have a second type of growth law: 

(8.22) $\frac{dx}{dt} = kx(M - x)$ , 

where, as in (8.21), k may be constant or, more generally, k may change with time. Technological 
improvements may tend to increase or decrease the value of M slowly, and hence we can generalize 
(8.22) even further by allowing M to change with time. 
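
The qualitative behavior of the second growth law can be explored numerically before any formula is derived. The sketch below is an illustration only and is not part of the original text: the function name `logistic_euler`, the use of Euler's method, and the sample values x(0) = 1, k = 0.01, M = 100 are all assumptions. It steps the equation dx/dt = kx(M - x) forward in time with constant k and M.

```python
def logistic_euler(x0, k, M, t_end, steps=10000):
    """Integrate dx/dt = k*x*(M - x) from t = 0 to t_end with Euler's method."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * k * x * (M - x)   # one Euler step of the growth law
    return x

# Illustrative values only: the population should rise and level off below M.
for t_end in (2.0, 5.0, 10.0, 20.0):
    print(t_end, logistic_euler(1.0, 0.01, 100.0, t_end))
```

With these assumed values the computed population increases with time and stays below the maximum M, in line with the reasoning that led to the growth law.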

13. Express x as a function of t for each of the “growth laws” in (8.20) and (8.22) (with k and M 
both constant). Show that the result for (8.22) can be expressed as follows: 

(8.23) $x = \frac{M}{1 + e^{-\alpha (t - t_1)}}$ , 

where α is a constant and $t_1$ is the time at which x = M/2. 

14. Assume the growth law in formula (8.23) of Exercise 13, and suppose a census is taken 
at three equally spaced times $t_1$, $t_2$, $t_3$, the resulting numbers being $x_1$, $x_2$, $x_3$. Show 
that this suffices to determine M and that, in fact, we have 

(8.24) $M = x_2 \, \frac{x_3 (x_2 - x_1) - x_1 (x_3 - x_2)}{x_2^2 - x_1 x_3}$ . 


15. Derive a formula that generalizes (8.23) of Exercise 13 for the growth law (8.22) when k is 
not necessarily constant. Express the result in terms of the time $t_0$ for which x = M/2. 

16. The Census Bureau reported the following population figures (in millions) for the United 
States at ten-year intervals from 1790 to 1950: 3.9, 5.3, 7.2, 9.6, 12.9, 17, 23, 31, 39, 50, 63, 76, 
92, 108, 122, 135, 150. 

(a) Use Equation (8.24) to determine a value of M on the basis of the census figures for 1790, 
1850, and 1910. 

(b) Same as (a) for the years 1910, 1930, 1950. 

(c) On the basis of your calculations in (a) and (b), would you be inclined to accept or reject 
the growth law (8.23) for the population of the United States? 

17. (a) Plot a graph of log x as a function of t, where x denotes the population figures quoted 
in Exercise 16. Use this graph to show that the growth law (8.20) was very nearly satisfied from 
1790 to 1910. Determine a reasonable average value of k for this period. 

(b) Determine a reasonable average value of k for the period from 1920 to 1950, assume that 
the growth law (8.20) will hold for this k, and predict the United States population for the 
years 2000 and 2050. 

18. The presence of toxins in a certain medium destroys a strain of bacteria at a rate jointly 
proportional to the number of bacteria present and to the amount of toxin. If there were no 
toxins present, the bacteria would grow at a rate proportional to the amount present. Let x 
denote the number of living bacteria present at time t. Assume that the amount of toxin is 
increasing at a constant rate and that the production of toxin begins at time t = 0. Set up a 
differential equation for x. Solve the differential equation. One of the curves shown in Figure 
8.3 best represents the general behavior of x as a function of t. State your choice and explain 
your reasoning. 


8.8 Linear equations of second order with constant coefficients 

A differential equation of the form 

$y'' + P_1(x) y' + P_2(x) y = R(x)$ 

is said to be a linear equation of second order. The functions $P_1$ and $P_2$ which multiply the 
unknown function y and its derivative y' are called the coefficients of the equation. 

For first-order linear equations, we proved an existence-uniqueness theorem and 
determined all solutions by an explicit formula. Although there is a corresponding 
existence-uniqueness theorem for the general second-order linear equation, there is no explicit 
formula which gives all solutions, except in some special cases. A study of the general 
linear equation of second order is undertaken in Volume II. Here we treat only the case 
in which the coefficients $P_1$ and $P_2$ are constants. When the right-hand member R(x) is 
identically zero, the equation is said to be homogeneous. 

The homogeneous linear equation with constant coefficients was the first differential 
equation of a general type to be completely solved. A solution was first published by Euler 
in 1743. Apart from its historical interest, this equation arises in a great variety of applied 
problems, so its study is of practical importance. Moreover, we can give explicit formulas 
for all the solutions. 

Consider a homogeneous linear equation with constant coefficients which we write as 
follows: 

y'' + ay' + by = 0 . 

We seek solutions on the entire real axis (-∞, +∞). One solution is the constant function 
y = 0. This is called the trivial solution. We are interested in finding nontrivial solutions, 
and we begin our study with some special cases for which nontrivial solutions can be found 
by inspection. In all these cases, the coefficient of y' is zero, and the equation has the form 
y'' + by = 0. We shall find that solving this special equation is tantamount to solving the 
general case. 

8.9 Existence of solutions of the equation y'' + by = 0 

EXAMPLE 1. The equation y'' = 0. Here both coefficients a and b are zero, and we can 
easily determine all solutions. Assume y is any function satisfying y'' = 0 on (-∞, +∞). 
Then its derivative y' is constant, say $y' = c_1$. Integrating this relation, we find that y 
necessarily has the form 

$y = c_1 x + c_2$ , 

where $c_1$ and $c_2$ are constants. Conversely, for any choice of constants $c_1$ and $c_2$, the linear 
polynomial $y = c_1 x + c_2$ satisfies y'' = 0, so we have found all solutions in this case. 

Next we assume that b ≠ 0 and treat separately the cases b < 0 and b > 0. 

EXAMPLE 2. The equation y'' + by = 0, where b < 0. Since b < 0, we can write $b = -k^2$, 
where k > 0, and the differential equation takes the form 

$y'' = k^2 y$ . 

One obvious solution is $y = e^{kx}$, and another is $y = e^{-kx}$. From these we can obtain 
further solutions by constructing linear combinations of the form 

$y = c_1 e^{kx} + c_2 e^{-kx}$ , 

where $c_1$ and $c_2$ are arbitrary constants. It will be shown presently, in Theorem 8.6, that 
all solutions are included in this formula. 

EXAMPLE 3. The equation y'' + by = 0, where b > 0. Here we can write $b = k^2$, where 
k > 0, and the differential equation takes the form 

$y'' = -k^2 y$ . 

Again we obtain some solutions by inspection. One solution is y = cos kx, and another 
is y = sin kx. From these we get further solutions by forming linear combinations, 

$y = c_1 \cos kx + c_2 \sin kx$ , 

where $c_1$ and $c_2$ are arbitrary constants. Theorem 8.6 will show that this formula includes 
all solutions. 

8.10 Reduction of the general equation to the special case y'' + by = 0 

The problem of solving a second-order linear equation with constant coefficients can 
be reduced to that of solving the special cases just discussed. There is a method for doing 
this that also applies to more general equations. The idea is to consider three functions 
y, u, and v such that y = uv. Differentiation gives us y' = uv' + u'v, and y'' = uv'' + 
2u'v' + u''v. Now we express the combination y'' + ay' + by in terms of u and v. We 
have 

(8.25) $y'' + ay' + by = uv'' + 2u'v' + u''v + a(uv' + u'v) + buv$ 
$\qquad = (v'' + av' + bv)u + (2v' + av)u' + vu''$ . 

Next we choose v to make the coefficient of u' zero. This requires that v' = -av/2, so we 
may choose $v = e^{-ax/2}$. For this v we have $v'' = -av'/2 = a^2 v/4$, and the coefficient of 
u in (8.25) becomes 

$v'' + av' + bv = \frac{a^2 v}{4} - \frac{a^2 v}{2} + bv = \frac{4b - a^2}{4} v$ . 

Thus, Equation (8.25) reduces to 

$y'' + ay' + by = \left( u'' + \frac{4b - a^2}{4} u \right) v$ . 

Since $v = e^{-ax/2}$, the function v is never zero, so y satisfies the differential equation y'' + 
ay' + by = 0 if and only if u satisfies $u'' + \frac{1}{4}(4b - a^2)u = 0$. Thus, we have proved the 
following theorem. 


THEOREM 8.4. Let y and u be two functions such that $y = u e^{-ax/2}$. Then, on the interval 
(-∞, +∞), y satisfies the differential equation y'' + ay' + by = 0 if and only if u satisfies 
the differential equation 

$u'' + \frac{4b - a^2}{4} u = 0$ . 

This theorem reduces the study of the equation y'' + ay' + by = 0 to the special case 
y'' + by = 0. We have exhibited nontrivial solutions of this equation but, except for the 
case b = 0, we have not yet shown that we have found all solutions. 
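
To see the reduction at work on a concrete equation, consider the following worked instance. It is not taken from the text: the coefficients a = 2 and b = 5 are assumed values chosen only for illustration. The substitution of Theorem 8.4 turns the equation into one of the special cases of Section 8.9, and the resulting solution can be checked by direct differentiation.

```latex
% Illustrative instance of the reduction, with assumed coefficients a = 2, b = 5.
\begin{align*}
y'' + 2y' + 5y = 0, \qquad y &= u\,e^{-x}, \\
u'' + \tfrac{1}{4}\,(4\cdot 5 - 2^2)\,u &= u'' + 4u = 0 .
\end{align*}
% Example 3 of Section 8.9 gives the solution u = \cos 2x, so y = e^{-x}\cos 2x.
% Direct check:
\begin{align*}
y'  &= -e^{-x}\cos 2x - 2e^{-x}\sin 2x, \\
y'' &= -3e^{-x}\cos 2x + 4e^{-x}\sin 2x, \\
y'' + 2y' + 5y &= (-3 - 2 + 5)\,e^{-x}\cos 2x + (4 - 4)\,e^{-x}\sin 2x = 0 .
\end{align*}
```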


8.11 Uniqueness theorem for the equation y'' + by = 0 

The problem of determining all solutions of the equation y'' + by = 0 can be solved 
with the help of the following uniqueness theorem. 


THEOREM 8.5. Assume two functions f and g satisfy the differential equation y'' + by = 0 
on (-∞, +∞). Assume also that f and g satisfy the initial conditions 


f(0) = g(0) ,    f'(0) = g'(0) . 


Then f(x) = g(x) for all x. 

Proof. Let h(x) = f(x) - g(x). We wish to prove that h(x) = 0 for all x. We shall 
do this by expressing h in terms of its Taylor polynomial approximations. 

First we note that h is also a solution of the differential equation y'' + by = 0 and satisfies 
the initial conditions h(0) = 0, h'(0) = 0. Now every function y satisfying the differential 
equation has derivatives of every order on (-∞, +∞) and they can be computed by 
repeated differentiation of the differential equation. For example, since y'' = -by, we 
have y''' = -by', and $y^{(4)} = -by'' = b^2 y$. By induction we find that the derivatives of 
even order are given by 

$y^{(2n)} = (-1)^n b^n y$ , 

while those of odd order are $y^{(2n-1)} = (-1)^{n-1} b^{n-1} y'$. Since h(0) and h'(0) are both 0, it 
follows that all derivatives $h^{(n)}(0)$ are zero. Therefore, each Taylor polynomial generated 
by h at 0 has all its coefficients zero. 

Now we apply Taylor's formula with remainder (Theorem 7.6), using a polynomial 
approximation of odd degree 2n - 1, and we find that 

$h(x) = E_{2n-1}(x)$ , 

where $E_{2n-1}(x)$ is the error term in Taylor's formula. To complete the proof, we show that 
the error can be made arbitrarily small by taking n large enough. 

We use Theorem 7.7 to estimate the size of the error term. For this we need estimates 
for the size of the derivative $h^{(2n)}$. Consider any finite closed interval [-c, c], where c > 0. 
Since h is continuous on this interval, it is bounded there, say |h(x)| ≤ M on [-c, c]. 
Since $h^{(2n)}(x) = (-1)^n b^n h(x)$, we have the estimate $|h^{(2n)}(x)| \le M |b|^n$ on [-c, c]. Theorem 
7.7 gives us $|E_{2n-1}(x)| \le M |b|^n x^{2n}/(2n)!$ so, on the interval [-c, c], we have the estimate 

(8.26) $0 \le |h(x)| \le \frac{M |b|^n x^{2n}}{(2n)!} \le \frac{M |b|^n c^{2n}}{(2n)!} = \frac{M A^{2n}}{(2n)!}$ , 

where $A = |b|^{1/2} c$. Now we show that $A^m/m!$ tends to 0 as m → +∞. This is obvious if 
0 ≤ A ≤ 1. If A > 1, we may write 

$\frac{A^m}{m!} = \frac{A}{1} \cdot \frac{A}{2} \cdots \frac{A}{k} \cdot \frac{A}{k+1} \cdots \frac{A}{m} \le \frac{A^k}{k!} \left( \frac{A}{k+1} \right)^{m-k}$ , 

where k < m. If we choose k to be the greatest integer ≤ A, then A < k + 1 and the last 
factor tends to 0 as m → +∞. Hence $A^m/m!$ tends to 0 as m → +∞, so inequality (8.26) 
shows that h(x) = 0 for every x in [-c, c]. But, since c is arbitrary, it follows that h(x) = 0 
for all real x. This completes the proof. 
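
The key limit in this proof, that A^m/m! tends to 0 even when A is large, is easy to observe numerically. The short Python sketch below is an illustration added here, not part of the original argument; the function name `ratio_sequence` and the sample value A = 10 are assumptions. It builds the terms one factor at a time, exactly as in the product displayed in the proof.

```python
def ratio_sequence(A, m_max):
    """Return [A**m / m! for m = 0..m_max], built one factor A/m at a time."""
    terms = [1.0]
    for m in range(1, m_max + 1):
        terms.append(terms[-1] * A / m)   # multiply by the next factor A/m
    return terms

terms = ratio_sequence(10.0, 60)          # A = 10 is an illustrative choice
for m in (5, 10, 20, 40, 60):
    print(m, terms[m])
```

The terms grow while the factor A/m exceeds 1 and shrink once m passes A, which is the behavior the choice of k in the proof exploits.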

Note: Theorem 8.5 tells us that two solutions of the differential equation y'' + by = 0 
which have the same value and the same derivative at 0 must agree everywhere. The choice 
of the point 0 is not essential. The same argument shows that the theorem is also true if 0 
is replaced by an arbitrary point c. In the foregoing proof, we simply use Taylor 
polynomial approximations at c instead of at 0. 

8.12 Complete solution of the equation y'' + by = 0 

The uniqueness theorem enables us to characterize all solutions of the differential 
equation y'' + by = 0. 

THEOREM 8.6. Given a real number b, define two functions $u_1$ and $u_2$ on (-∞, +∞) as 
follows: 

(a) If b = 0, let $u_1(x) = 1$, $u_2(x) = x$. 

(b) If b < 0, write $b = -k^2$ and define $u_1(x) = e^{kx}$, $u_2(x) = e^{-kx}$. 

(c) If b > 0, write $b = k^2$ and define $u_1(x) = \cos kx$, $u_2(x) = \sin kx$. 

Then every solution of the differential equation y'' + by = 0 on (-∞, +∞) has the form 

(8.27) $y = c_1 u_1(x) + c_2 u_2(x)$ , 

where $c_1$ and $c_2$ are constants. 

Proof. We proved in Section 8.9 that for each choice of constants $c_1$ and $c_2$ the function 
y given in (8.27) is a solution of the equation y'' + by = 0. Now we show that all solutions 
have this form. The case b = 0 was settled in Section 8.9, so we may assume that b ≠ 0. 

The idea of the proof is this: Let y = f(x) be any solution of y'' + by = 0. If we can 
show that constants $c_1$ and $c_2$ exist satisfying the pair of equations 

(8.28) $c_1 u_1(0) + c_2 u_2(0) = f(0)$ ,    $c_1 u_1'(0) + c_2 u_2'(0) = f'(0)$ , 

then both f and $c_1 u_1 + c_2 u_2$ are solutions of the differential equation y'' + by = 0 having 
the same value and the same derivative at 0. By the uniqueness theorem, it follows that 
$f = c_1 u_1 + c_2 u_2$. 

In case (b), we have $u_1(x) = e^{kx}$, $u_2(x) = e^{-kx}$, so $u_1(0) = u_2(0) = 1$ and $u_1'(0) = k$, 
$u_2'(0) = -k$. Thus the equations in (8.28) become $c_1 + c_2 = f(0)$, and $c_1 - c_2 = f'(0)/k$. 
They have the solution $c_1 = \frac{1}{2} f(0) + \frac{1}{2} f'(0)/k$, $c_2 = \frac{1}{2} f(0) - \frac{1}{2} f'(0)/k$. 

In case (c), we have $u_1(x) = \cos kx$, $u_2(x) = \sin kx$, so $u_1(0) = 1$, $u_2(0) = 0$, $u_1'(0) = 0$, 
$u_2'(0) = k$, and the solutions are $c_1 = f(0)$, and $c_2 = f'(0)/k$. Since $c_1$ and $c_2$ always exist 
to satisfy (8.28), the proof is complete. 
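
The constants found in this proof translate directly into a small routine. The Python sketch below is an illustration only; the function name `solve_special` and its argument names are assumptions, not notation from the text. Given b and the initial values f(0) and f'(0), it returns the solution of y'' + by = 0 determined by the three cases of Theorem 8.6.

```python
import math

def solve_special(b, f0, fp0):
    """Return the solution y of y'' + b*y = 0 with y(0) = f0 and y'(0) = fp0."""
    if b == 0:
        # u1 = 1, u2 = x, so c1 = f(0) and c2 = f'(0)
        return lambda x: f0 + fp0 * x
    k = math.sqrt(abs(b))
    if b < 0:
        # u1 = e^{kx}, u2 = e^{-kx}; constants as in case (b) of the proof
        c1 = 0.5 * f0 + 0.5 * fp0 / k
        c2 = 0.5 * f0 - 0.5 * fp0 / k
        return lambda x: c1 * math.exp(k * x) + c2 * math.exp(-k * x)
    # b > 0: u1 = cos kx, u2 = sin kx; constants as in case (c) of the proof
    c1, c2 = f0, fp0 / k
    return lambda x: c1 * math.cos(k * x) + c2 * math.sin(k * x)

# Illustrative use: y'' - y = 0 with y(0) = 1, y'(0) = 0 should reproduce cosh x.
y = solve_special(-1.0, 1.0, 0.0)
print(y(1.0), math.cosh(1.0))
```

By the uniqueness theorem, the function returned is the only solution with the prescribed initial values.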

8.13 Complete solution of the equation y'' + ay' + by = 0 

Theorem 8.4 tells us that y satisfies the differential equation y'' + ay' + by = 0 if 
and only if u satisfies $u'' + \frac{1}{4}(4b - a^2)u = 0$, where $y = e^{-ax/2} u$. From Theorem 8.6 we 
know that the nature of each solution u depends on the algebraic sign of the coefficient of 
u, that is, on the algebraic sign of $4b - a^2$ or, alternatively, of $a^2 - 4b$. We call the number 
$a^2 - 4b$ the discriminant of the differential equation y'' + ay' + by = 0 and denote it by 
d. When we combine the results of Theorems 8.4 and 8.6 we obtain the following. 


THEOREM 8.7. Let $d = a^2 - 4b$ be the discriminant of the linear differential equation 
y'' + ay' + by = 0. Then every solution of this equation on (-∞, +∞) has the form 

(8.29) $y = e^{-ax/2} [c_1 u_1(x) + c_2 u_2(x)]$ , 

where $c_1$ and $c_2$ are constants, and the functions $u_1$ and $u_2$ are determined according to the 
algebraic sign of the discriminant as follows: 

(a) If d = 0, then $u_1(x) = 1$ and $u_2(x) = x$. 

(b) If d > 0, then $u_1(x) = e^{kx}$ and $u_2(x) = e^{-kx}$, where $k = \frac{1}{2}\sqrt{d}$. 

(c) If d < 0, then $u_1(x) = \cos kx$ and $u_2(x) = \sin kx$, where $k = \frac{1}{2}\sqrt{-d}$. 
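
The case analysis in Theorem 8.7 can be turned into a short program and tested numerically. The following Python sketch is illustrative only: the names `general_solution` and `ode_residual`, and the use of central differences to approximate y' and y'', are assumptions made here and are not part of the text. It builds the solution in (8.29) from a, b, and the two constants, then evaluates y'' + ay' + by approximately at a point.

```python
import math

def general_solution(a, b, c1, c2):
    """Solution y = e^{-ax/2} [c1*u1(x) + c2*u2(x)] of y'' + a*y' + b*y = 0."""
    d = a * a - 4 * b                       # discriminant of the equation
    if d == 0:
        u1, u2 = (lambda x: 1.0), (lambda x: x)
    elif d > 0:
        k = 0.5 * math.sqrt(d)
        u1, u2 = (lambda x: math.exp(k * x)), (lambda x: math.exp(-k * x))
    else:
        k = 0.5 * math.sqrt(-d)
        u1, u2 = (lambda x: math.cos(k * x)), (lambda x: math.sin(k * x))
    return lambda x: math.exp(-a * x / 2) * (c1 * u1(x) + c2 * u2(x))

def ode_residual(y, a, b, x, h=1e-4):
    """Approximate y'' + a*y' + b*y at x using central differences."""
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a * d1 + b * y(x)

# One illustrative equation for each sign of the discriminant.
for a, b in ((2, 1), (1, -2), (2, 5)):
    y = general_solution(a, b, 1.0, -0.5)
    print(a, b, ode_residual(y, a, b, 0.3))
```

The printed residuals should be close to zero, up to the error of the finite-difference approximation.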

Note: In case (b), where the discriminant d is positive, the solution y in (8.29) is a linear 
combination of two exponential functions, 

$y = e^{-ax/2} (c_1 e^{kx} + c_2 e^{-kx}) = c_1 e^{r_1 x} + c_2 e^{r_2 x}$ , 

where 

$r_1 = -\frac{a}{2} + k = \frac{-a + \sqrt{d}}{2}$ ,    $r_2 = -\frac{a}{2} - k = \frac{-a - \sqrt{d}}{2}$ . 

The two numbers $r_1$ and $r_2$ have sum $r_1 + r_2 = -a$ and product $r_1 r_2 = \frac{1}{4}(a^2 - d) = b$. 
Therefore, they are the roots of the quadratic equation 

$r^2 + ar + b = 0$ . 

This is called the characteristic equation associated with the differential equation 

y'' + ay' + by = 0 . 

The number $d = a^2 - 4b$ is also called the discriminant of this quadratic equation; its 
algebraic sign determines the nature of the roots. If d > 0, the quadratic equation has real 
roots given by $(-a \pm \sqrt{d})/2$. If d < 0, the quadratic equation has no real roots but it 
does have complex roots $r_1$ and $r_2$. The definition of the exponential function can be 
extended so that $e^{r_1 x}$ and $e^{r_2 x}$ are meaningful when $r_1$ and $r_2$ are complex numbers. This 
extension, described in Chapter 9, is made in such a way that the linear combination in 
(8.29) can also be written as a linear combination of $e^{r_1 x}$ and $e^{r_2 x}$, when $r_1$ and $r_2$ are 
complex. 
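
The relations between the roots and the coefficients can be confirmed with a few lines of code. The Python sketch below is an added illustration; the function name `characteristic_roots` is an assumption, and the standard-library function `cmath.sqrt` is used so that a negative discriminant yields complex roots instead of an error.

```python
import cmath

def characteristic_roots(a, b):
    """Roots r1, r2 of r**2 + a*r + b = 0 (complex when d = a**2 - 4*b < 0)."""
    d = a * a - 4 * b
    sqrt_d = cmath.sqrt(d)          # complex square root; imaginary part is 0 when d >= 0
    return (-a + sqrt_d) / 2, (-a - sqrt_d) / 2

# Illustrative coefficients: one equation with d > 0 and one with d < 0.
for a, b in ((1, -2), (2, 5)):
    r1, r2 = characteristic_roots(a, b)
    print(a, b, r1, r2, "sum:", r1 + r2, "product:", r1 * r2)
```

In both cases the sum of the roots should equal -a and the product should equal b.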

We conclude this section with some miscellaneous remarks. Since all the solutions of 
the differential equation y” + ay' + by = 0 are contained in formula (8.29), the linear 
combination on the right is often called the general solution of the differential equation. 
Any solution obtained by specializing the constants $c_1$ and $c_2$ is called a particular solution. 

For example, taking $c_1 = 1$, $c_2 = 0$, and then $c_1 = 0$, $c_2 = 1$, we obtain the two particular 
solutions 

$v_1 = e^{-ax/2} u_1(x)$ ,    $v_2 = e^{-ax/2} u_2(x)$ . 

These two solutions are of special importance because linear combinations of them give 
us all solutions. Any pair of solutions with this property is called a basis for the set of 
all solutions. 

A differential equation always has more than one basis. For example, the equation 
y'' = 9y has the basis $v_1 = e^{3x}$, $v_2 = e^{-3x}$. But it also has the basis $w_1 = \cosh 3x$, 
$w_2 = \sinh 3x$. In fact, since $e^{3x} = w_1 + w_2$ and $e^{-3x} = w_1 - w_2$, every linear combination of $e^{3x}$ 
and $e^{-3x}$ is also a linear combination of $w_1$ and $w_2$. Hence, the pair $w_1$, $w_2$ is another basis. 

It can be shown that any pair of solutions $v_1$ and $v_2$ of a differential equation y'' + 
ay' + by = 0 will be a basis if the ratio $v_2/v_1$ is not constant. Although we shall not need 
this fact, we mention it here because it is important in the theory of second-order linear 
equations with nonconstant coefficients. A proof is outlined in Exercise 23 of Section 8.14. 


8.14 Exercises 


Find all solutions of the following differential equations on (-∞, +∞). 

1. y'' - 4y = 0.               6. y'' + 2y' - 3y = 0. 

2. y'' + 4y = 0.               7. y'' - 2y' + 2y = 0. 

3. y'' - 4y' = 0.              8. y'' - 2y' + 5y = 0. 

4. y'' + 4y' = 0.              9. y'' + 2y' + y = 0. 

5. y'' - 2y' + 3y = 0.        10. y'' - 2y' + y = 0. 


In Exercises 11 through 14, find the particular solution satisfying the given initial conditions. 

11. 2y'' + 3y' = 0, with y = 1 and y' = 1 when x = 0. 

12. y'' + 25y = 0, with y = -1 and y' = 0 when x = 3. 

13. y'' - 4y' - y = 0, with y = 2 and y' = -1 when x = 1. 

14. y'' + 4y' + 5y = 0, with y = 2 and y' = y'' when x = 0. 

15. The graph of a solution u of the differential equation y'' - 4y' + 29y = 0 intersects the graph 
of a solution v of the equation y'' + 4y' + 13y = 0 at the origin. The two curves have equal 
slopes at the origin. Determine u and v if u'(π/2) = 1. 

16. The graph of a solution// of the differential equation y” — 3y' Ay = 0 intersects the graph 

of a solution v of the equation y” + Ay' ™ = 0 at the origin. Determine u and v if the two 

curves have equal slopes at the origin and if 


v(xf 
hm — — 
co U(X) 


17. Find all values of the constant k such that the differential equation y'' + ky = 0 has a 
nontrivial solution $y = f_k(x)$ for which $f_k(0) = f_k(1) = 0$. For each permissible value of k, 
determine the corresponding solution $y = f_k(x)$. Consider both positive and negative values of k. 

18. If (a, b) is a given point in the plane and if m is a given real number, prove that the differential 
equation y” + k 2 y = 0 has exactly one solution whose graph passes through (a, b) and has the 
slope m there. Discuss also the case k = 0. 

19. (a) Let $(a_1, b_1)$ and $(a_2, b_2)$ be two points in the plane such that $a_1 - a_2 \ne n\pi$, where n is an 
integer. Prove that there is exactly one solution of the differential equation y'' + y = 0 whose 
graph passes through these two points. 

(b) Is the statement in part (a) ever true if $a_1 - a_2$ is a multiple of π? 

(c) Generalize the result in part (a) for the equation $y'' + k^2 y = 0$. Discuss also the case k = 0. 

20. In each case, find a linear differential equation of second order satisfied by $u_1$ and $u_2$. 

(a) $u_1(x) = e^x$, $u_2(x) = e^{-x}$. 

(b) $u_1(x) = e^{2x}$, $u_2(x) = x e^{2x}$. 

(c) $u_1(x) = e^{-x/2} \cos x$, $u_2(x) = e^{-x/2} \sin x$. 

(d) $u_1(x) = \sin(2x + 1)$, $u_2(x) = \sin(2x + 2)$. 

(e) $u_1(x) = \cosh x$, $u_2(x) = \sinh x$. 


The Wronskian. Given two functions $u_1$ and $u_2$, the function W defined by $W(x) = u_1(x) u_2'(x) - 
u_2(x) u_1'(x)$ is called their Wronskian, after J. M. H. Wronski (1778-1853). The following exercises 
are concerned with properties of the Wronskian. 

21. (a) If the Wronskian W(x) of $u_1$ and $u_2$ is zero for all x in an open interval I, prove that the 
quotient $u_2/u_1$ is constant on I. In other words, if $u_2/u_1$ is not constant on I, then W(c) ≠ 0 
for at least one c in I. 

(b) Prove that the derivative of the Wronskian is $W' = u_1 u_2'' - u_2 u_1''$. 

22. Let W be the Wronskian of two solutions $u_1$, $u_2$ of the differential equation y'' + ay' + by = 0, 
where a and b are constants. 

(a) Prove that W satisfies the first-order equation W' + aW = 0 and hence $W(x) = W(0) e^{-ax}$. 
This formula shows that if W(0) ≠ 0, then W(x) ≠ 0 for all x. 

(b) Assume $u_1$ is not identically zero. Prove that W(0) = 0 if and only if $u_2/u_1$ is constant. 

23. Let $v_1$ and $v_2$ be any two solutions of the differential equation y'' + ay' + by = 0 such that 
$v_2/v_1$ is not constant. 

(a) Let y = f(x) be any solution of the differential equation. Use properties of the Wronskian 
to prove that constants $c_1$ and $c_2$ exist such that 

$c_1 v_1(0) + c_2 v_2(0) = f(0)$ ,    $c_1 v_1'(0) + c_2 v_2'(0) = f'(0)$ . 

(b) Prove that every solution has the form $y = c_1 v_1 + c_2 v_2$. In other words, $v_1$ and $v_2$ form 
a basis for the set of all solutions. 


8.15 Nonhomogeneous linear equations of second order with constant coefficients 

We turn now to a discussion of nonhomogeneous equations of the form 

(8.30) y” + ay' + by = R , 

where the coefficients a and b are constants but the right-hand member R is any function 
continuous on (-∞, +∞). The discussion may be simplified by the use of operator 
notation. For any function f with derivatives f' and f'', we may define an operator L 
which transforms f into another function L(f) defined by the equation 

L(f) = f'' + af' + bf . 

In operator notation, the differential equation (8.30) is written in the simpler form 

L(y) = R . 

It is easy to verify that $L(y_1 + y_2) = L(y_1) + L(y_2)$, and that L(cy) = cL(y) for every 
constant c. Therefore, for every pair of constants $c_1$ and $c_2$, we have 

$L(c_1 y_1 + c_2 y_2) = c_1 L(y_1) + c_2 L(y_2)$ . 

This is called the linearity property of the operator L. 

Now suppose $y_1$ and $y_2$ are any two solutions of the equation L(y) = R. Since 
$L(y_1) = L(y_2) = R$, linearity gives us 

$L(y_2 - y_1) = L(y_2) - L(y_1) = R - R = 0$ , 

so $y_2 - y_1$ is a solution of the homogeneous equation L(y) = 0. Therefore, we must have 
$y_2 - y_1 = c_1 v_1 + c_2 v_2$, where $c_1 v_1 + c_2 v_2$ is the general solution of the homogeneous 
equation, or 

$y_2 = c_1 v_1 + c_2 v_2 + y_1$ . 

This equation must be satisfied by every pair of solutions $y_1$ and $y_2$ of the nonhomogeneous 
equation L(y) = R. Therefore, if we can determine one particular solution $y_1$ of the 
nonhomogeneous equation, all solutions are contained in the formula 

(8.31) $y = c_1 v_1 + c_2 v_2 + y_1$ , 

where $c_1$ and $c_2$ are arbitrary constants. Each such y is clearly a solution of L(y) = R 
because $L(c_1 v_1 + c_2 v_2 + y_1) = L(c_1 v_1 + c_2 v_2) + L(y_1) = 0 + R = R$. Since all solutions 
of L(y) = R are found in (8.31), the linear combination $c_1 v_1 + c_2 v_2 + y_1$ is called the general 
solution of (8.30). Thus, we have proved the following theorem. 

THEOREM 8.8. If $y_1$ is a particular solution of the nonhomogeneous equation L(y) = R, 
the general solution is obtained by adding to $y_1$ the general solution of the corresponding 
homogeneous equation L(y) = 0. 


Theorem 8.7 tells us how to find the general solution of the homogeneous equation 
L(y) = 0. It has the form $y = c_1 v_1 + c_2 v_2$, where 

(8.32) $v_1(x) = e^{-ax/2} u_1(x)$ ,    $v_2(x) = e^{-ax/2} u_2(x)$ , 

the functions $u_1$ and $u_2$ being determined by the discriminant of the equation, as described 
in Theorem 8.7. Now we show that $v_1$ and $v_2$ can be used to construct a particular solution 
$y_1$ of the nonhomogeneous equation L(y) = R. 

The construction involves a function W defined by the equation 

$W(x) = v_1(x) v_2'(x) - v_2(x) v_1'(x)$ . 

This is called the Wronskian of $v_1$ and $v_2$; some of its properties are described in Exercises 
21 and 22 of Section 8.14. We shall need the property that W(x) is never zero. This can be 
proved by the methods outlined in the exercises or it can be verified directly for the particular 
functions $v_1$ and $v_2$ given in (8.32). 


THEOREM 8.9. Let $v_1$ and $v_2$ be the solutions of the equation L(y) = 0 given by (8.32), 
where L(y) = y'' + ay' + by. Let W denote the Wronskian of $v_1$ and $v_2$. Then the 
nonhomogeneous equation L(y) = R has a particular solution $y_1$ given by the formula 

$y_1(x) = t_1(x) v_1(x) + t_2(x) v_2(x)$ , 

where 

(8.33) $t_1(x) = -\int v_2(x) \frac{R(x)}{W(x)} \, dx$ ,    $t_2(x) = \int v_1(x) \frac{R(x)}{W(x)} \, dx$ . 

Proof. Let us try to find functions $t_1$ and $t_2$ such that the combination $y_1 = t_1 v_1 + t_2 v_2$
will satisfy the equation $L(y_1) = R$. We have

$$y_1' = t_1 v_1' + t_2 v_2' + (t_1' v_1 + t_2' v_2) ,$$

$$y_1'' = t_1 v_1'' + t_2 v_2'' + (t_1' v_1' + t_2' v_2') + (t_1' v_1 + t_2' v_2)' .$$

When we form the linear combination $L(y_1) = y_1'' + a y_1' + b y_1$, the terms involving $t_1$
and $t_2$ drop out because of the relations $L(v_1) = L(v_2) = 0$. The remaining terms give us
the relation

$$L(y_1) = (t_1' v_1' + t_2' v_2') + (t_1' v_1 + t_2' v_2)' + a (t_1' v_1 + t_2' v_2) .$$

We want to choose $t_1$ and $t_2$ so that $L(y_1) = R$. We can satisfy this equation if we choose
$t_1$ and $t_2$ so that

$$t_1' v_1 + t_2' v_2 = 0 \quad \text{and} \quad t_1' v_1' + t_2' v_2' = R .$$

This is a pair of algebraic equations for $t_1'$ and $t_2'$. The determinant of the system is the
Wronskian of $v_1$ and $v_2$. Since this is never zero, the system has a solution given by

$$t_1' = -v_2 R / W \quad \text{and} \quad t_2' = v_1 R / W .$$

Integrating these relations, we obtain Equation (8.33), thus completing the proof.

The method by which we obtained the solution $y_1$ is sometimes called variation of
parameters. It was first used by Johann Bernoulli in 1697 to solve linear equations of first order,
and then by Lagrange in 1774 to solve linear equations of second order.

Note: Since the functions $t_1$ and $t_2$ in Theorem 8.9 are expressed as indefinite integrals,
each of them is determined only to within an additive constant. If we add a constant $c_1$
to $t_1$ and a constant $c_2$ to $t_2$ we change the function $y_1$ to a new function
$y_2 = y_1 + c_1 v_1 + c_2 v_2$. By linearity, we have

$$L(y_2) = L(y_1) + L(c_1 v_1 + c_2 v_2) = L(y_1) ,$$

so the new function $y_2$ is also a particular solution of the nonhomogeneous equation.

example 1. Find the general solution of the equation $y'' + y = \tan x$ on $(-\pi/2, \pi/2)$.

Solution. The functions $v_1$ and $v_2$ of Equation (8.32) are given by

$$v_1(x) = \cos x , \quad v_2(x) = \sin x .$$

Their Wronskian is $W(x) = v_1(x) v_2'(x) - v_2(x) v_1'(x) = \cos^2 x + \sin^2 x = 1$. Therefore
Equation (8.33) gives us

$$t_1(x) = -\int \sin x \tan x \, dx = \sin x - \log |\sec x + \tan x| ,$$

and

$$t_2(x) = \int \cos x \tan x \, dx = \int \sin x \, dx = -\cos x .$$

Thus, a particular solution of the nonhomogeneous equation is

$$y_1 = t_1(x) v_1(x) + t_2(x) v_2(x) = \sin x \cos x - \cos x \log |\sec x + \tan x| - \sin x \cos x = -\cos x \log |\sec x + \tan x| .$$

By Theorem 8.8, the general solution of the equation is

$$y = c_1 \cos x + c_2 \sin x - \cos x \log |\sec x + \tan x| .$$
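
As a numerical sanity check of the particular solution found in this example, the following minimal sketch estimates $y_1'' + y_1 - \tan x$ with a central difference at a few points of $(-\pi/2, \pi/2)$. The function names `y1` and `residual`, the step size, and the sample points are illustrative choices, not part of the text.

```python
import math

def y1(x):
    # Particular solution from the example: -cos x * log|sec x + tan x|.
    return -math.cos(x) * math.log(abs(1.0 / math.cos(x) + math.tan(x)))

def residual(x, h=1e-4):
    # Central-difference estimate of y1'' + y1 - tan x; it should be near zero.
    second = (y1(x + h) - 2.0 * y1(x) + y1(x - h)) / (h * h)
    return second + y1(x) - math.tan(x)

# Sample points chosen inside (-pi/2, pi/2); they are arbitrary.
max_residual = max(abs(residual(x)) for x in (-1.0, -0.3, 0.4, 1.2))
print(max_residual)
```

A small residual is only evidence, not a proof; the derivation above is what establishes the result.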

Although Theorem 8.9 provides a general method for determining a particular solution
of $L(y) = R$, special methods are available that are often easier to apply when the function
$R$ has certain special forms. In the next section we describe a method that works when $R$
is a polynomial or a polynomial times an exponential.

8.16 Special methods for determining a particular solution of the nonhomogeneous equation
$y'' + ay' + by = R$

CASE 1. The right-hand member $R$ is a polynomial of degree $n$. If $b \ne 0$, we can always
find a polynomial of degree $n$ that satisfies the equation. We try a polynomial of the form

$$y_1(x) = \sum_{k=0}^{n} a_k x^k$$

with undetermined coefficients. Substituting in the differential equation $L(y) = R$ and
equating coefficients of like powers of $x$, we may determine $a_n, a_{n-1}, \ldots, a_1, a_0$ in succession.

The method is illustrated by the following example. 

example 1. Find the general solution of the equation $y'' + y = x^3$.

Solution. The general solution of the homogeneous equation $y'' + y = 0$ is given by
$y = c_1 \cos x + c_2 \sin x$. To this we must add one particular solution of the nonhomogeneous
equation. Since the right member is a cubic polynomial and since the coefficient of $y$ is
nonzero, we try to find a particular solution of the form $y_1(x) = Ax^3 + Bx^2 + Cx + D$.
Differentiating twice, we find that $y_1''(x) = 6Ax + 2B$. The differential equation leads to
the relation

$$(6Ax + 2B) + (Ax^3 + Bx^2 + Cx + D) = x^3 .$$

Equating coefficients of like powers of $x$, we obtain $A = 1$, $B = 0$, $C = -6$, and $D = 0$,
so a particular solution is $y_1(x) = x^3 - 6x$. Thus, the general solution is

$$y = c_1 \cos x + c_2 \sin x + x^3 - 6x .$$

It may be of interest to compare this method with variation of parameters. Equation
(8.33) gives us

$$t_1(x) = -\int x^3 \sin x \, dx = -(3x^2 - 6) \sin x + (x^3 - 6x) \cos x$$

and

$$t_2(x) = \int x^3 \cos x \, dx = (3x^2 - 6) \cos x + (x^3 - 6x) \sin x .$$

When we form the combination $t_1 v_1 + t_2 v_2$, we find the particular solution $y_1(x) = x^3 - 6x$,
as before. In this case, the use of variation of parameters required the evaluation of the
integrals $\int x^3 \sin x \, dx$ and $\int x^3 \cos x \, dx$. With the method of undetermined coefficients, no
integration is required.
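
The coefficient matching of Case 1 can be sketched in a few lines of code. Equating the coefficients of $x^k$ in $y'' + ay' + by = R$ gives $(k+2)(k+1)a_{k+2} + a(k+1)a_{k+1} + b\,a_k = r_k$, which can be solved from $k = n$ downward when $b \ne 0$. The function name `polynomial_particular_solution` and the coefficient-list representation (increasing powers of $x$) are assumptions made for this illustration.

```python
from fractions import Fraction

def polynomial_particular_solution(a, b, r):
    """Coefficients [a_0, ..., a_n] of a polynomial solving y'' + a*y' + b*y = R.

    r lists the coefficients of R in increasing powers of x; b must be nonzero.
    """
    if b == 0:
        raise ValueError("this sketch covers only the case b != 0")
    n = len(r) - 1
    coeffs = [Fraction(0)] * (n + 3)  # two zero pads stand for a_{n+1}, a_{n+2}
    for k in range(n, -1, -1):
        # Coefficient of x^k: (k+2)(k+1) a_{k+2} + a (k+1) a_{k+1} + b a_k = r_k.
        rhs = Fraction(r[k]) - a * (k + 1) * coeffs[k + 1] - (k + 2) * (k + 1) * coeffs[k + 2]
        coeffs[k] = rhs / b
    return coeffs[: n + 1]

# y'' + y = x^3 from the example: expect x^3 - 6x.
cubic = polynomial_particular_solution(0, 1, [0, 0, 0, 1])
print(cubic)
```

Exact rational arithmetic is used so that the coefficients can be compared directly with those obtained by hand.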

If the coefficient $b$ is zero, the equation $y'' + ay' = R$ cannot be satisfied by a polynomial
of degree $n$, but it can be satisfied by a polynomial of degree $n + 1$ if $a \ne 0$. If both $a$ and
$b$ are zero, the equation becomes $y'' = R$; its general solution is a polynomial of degree
$n + 2$ obtained by two successive integrations.

CASE 2. The right-hand member has the form $R(x) = p(x) e^{mx}$, where $p$ is a polynomial
of degree $n$, and $m$ is constant.

In this case the change of variable $y = u(x) e^{mx}$ transforms the differential equation
$y'' + ay' + by = R$ to a new equation,

$$u'' + (2m + a) u' + (m^2 + am + b) u = p .$$

This is the type discussed in Case 1 so it always has a polynomial solution $u_1$. Hence, the
original equation has a particular solution of the form $y_1 = u_1(x) e^{mx}$, where $u_1$ is a
polynomial. If $m^2 + am + b \ne 0$, the degree of $u_1$ is the same as the degree of $p$. If
$m^2 + am + b = 0$ but $2m + a \ne 0$, the degree of $u_1$ is one greater than that of $p$. If both
$m^2 + am + b = 0$ and $2m + a = 0$, the degree of $u_1$ is two greater than the degree of $p$.

EXAMPLE 2. Find a particular solution of the equation $y'' + y = x e^{3x}$.

Solution. The change of variable $y = u e^{3x}$ leads to the new equation
$u'' + 6u' + 10u = x$. Trying $u_1(x) = Ax + B$, we find the particular solution $u_1(x) = (5x - 3)/50$, so
a particular solution of the original equation is $y_1 = e^{3x} (5x - 3)/50$.
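
The result of Example 2 can be checked exactly. The sketch below represents a function $q(x)e^{mx}$ by the coefficient list of $q$ and uses the rule $(q e^{mx})' = (q' + mq)e^{mx}$; the helper names `derivative_of_poly_times_exp` and `apply_operator` are hypothetical and introduced only for this check.

```python
from fractions import Fraction

def derivative_of_poly_times_exp(q, m):
    """Given q (coefficients, increasing powers), return s with (q e^{mx})' = s e^{mx}."""
    dq = [k * q[k] for k in range(1, len(q))] + [0]  # q', padded to the length of q
    return [dq[k] + m * q[k] for k in range(len(q))]

def apply_operator(q, m, a, b):
    """Polynomial factor of y'' + a y' + b y for y = q e^{mx}, after dividing by e^{mx}."""
    first = derivative_of_poly_times_exp(q, m)
    second = derivative_of_poly_times_exp(first, m)
    return [second[k] + a * first[k] + b * q[k] for k in range(len(q))]

u1 = [Fraction(-3, 50), Fraction(1, 10)]  # (5x - 3)/50
result = apply_operator(u1, m=3, a=0, b=1)  # should be the polynomial x
print(result)
```

The output coefficients correspond to the polynomial $x$, so $L(y_1) = x e^{3x}$ as required.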

The method of undetermined coefficients can also be used if $R$ has the form
$R(x) = p(x) e^{mx} \cos \alpha x$, or $R(x) = p(x) e^{mx} \sin \alpha x$, where $p$ is a polynomial and $m$ and $\alpha$ are constants.
In either case, there is always a particular solution of the form
$y_1(x) = e^{mx} [q(x) \cos \alpha x + r(x) \sin \alpha x]$, where $q$ and $r$ are polynomials.


8.17 Exercises 


Find the general solution of each of the differential equations in Exercises 1 through 17. If the
solution is not valid over the entire real axis, describe an interval over which it is valid.

1. $y'' - y = x$.
2. $y'' - y' = x^2$.
3. $y'' + y' = x^2 + 2x$.
4. $y'' - 2y' + 3y = x^3$.
5. $y'' - 5y' + 4y = x^2 - 2x + 1$.
6. $y'' + y' - 6y = 2x^3 + 5x^2 - 7x + 2$.
7. $y'' - 4y = e^{2x}$.
8. $y'' + 4y = e^{-2x}$.
9. $y'' + y' - 2y = e^x$.
10. $y'' + y' - 2y = e^{2x}$.
11. $y'' + y' - 2y = e^x + e^{2x}$.
12. $y'' - 2y' + y = x + 2x e^x$.
13. $y'' + 2y' + y = e^{-x}/x^2$.
14. $y'' + y = \cot^2 x$.
15. $y'' - y = 2/(1 + e^x)$.
16. $y'' + y' - 2y = e^x/(1 + e^x)$.
17. $y'' + 6y' + 9y = f(x)$, where $f(x) = 1$ for $\ldots < x < 2$, and $f(x) = 0$ for all other $x$.

18. If $k$ is a nonzero constant, prove that the equation $y'' - k^2 y = R(x)$ has a particular solution
$y_1$ given by

$$y_1 = \frac{1}{k} \int_0^x R(t) \sinh k(x - t) \, dt .$$

Find the general solution of the equation $y'' - 9y = e^{3x}$.

19. If $k$ is a nonzero constant, prove that the equation $y'' + k^2 y = R(x)$ has a particular solution
$y_1$ given by

$$y_1 = \frac{1}{k} \int_0^x R(t) \sin k(x - t) \, dt .$$

Find the general solution of the equation $y'' + 9y = \sin 3x$.

In each of Exercises 20 through 25, determine the general solution.

20. $y'' + y = \sin x$.
21. $y'' + y = \cos x$.
22. $y'' + 4y = 3x \cos x$.
23. $y'' + 4y = 3x \sin x$.
24. $y'' - 3y' = 2e^{2x} \sin x$.
25. $y'' + y = e^{2x} \cos 3x$.

8.18 Examples of physical problems leading to linear second-order equations with constant 
coefficients 

example 1. Simple harmonic motion. Suppose a particle is constrained to move in a
straight line with its acceleration directed toward a fixed point of the line and proportional
to the displacement from that point. If we take the origin as the fixed point and let $y$ be
the displacement at time $x$, then the acceleration $y''$ must be negative when $y$ is positive,
and positive when $y$ is negative. Therefore we can write $y'' = -k^2 y$, or

$$y'' + k^2 y = 0 ,$$

where $k^2$ is a positive constant. This is called the differential equation of simple harmonic
motion. It is often used as the mathematical model for the motion of a point on a vibrating
mechanism such as a plucked string or a vibrating tuning fork. The same equation arises
in electric circuit theory where it is called the equation of the harmonic oscillator.

Theorem 8.6 tells us that all solutions have the form

(8.34) $y = A \sin kx + B \cos kx ,$

where $A$ and $B$ are arbitrary constants. We can express the solutions in terms of the sine
or cosine alone. For example, we can introduce new constants $C$ and $\alpha$, where

$$C = \sqrt{A^2 + B^2} \quad \text{and} \quad \alpha = \arctan \frac{B}{A} ;$$

then we have (see Figure 8.4) $A = C \cos \alpha$, $B = C \sin \alpha$, and Equation (8.34) becomes

$$y = C \cos \alpha \sin kx + C \sin \alpha \cos kx = C \sin (kx + \alpha) .$$

When the solution is written in this way, the constants $C$ and $\alpha$ have a simple geometric
interpretation (see Figure 8.5). The extreme values of $y$, which occur when $\sin (kx + \alpha) = \pm 1$,
are $\pm C$. When $x = 0$, the initial displacement is $C \sin \alpha$. As $x$ increases, the particle
oscillates between the extreme values $+C$ and $-C$ with period $2\pi/k$. The angle $kx + \alpha$
is called the phase angle and $\alpha$ itself is called the initial value of the phase angle.

Figure 8.4    Figure 8.5 Simple harmonic motion.
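
The passage from (8.34) to the single-sine form can be illustrated numerically. In the sketch below, `amplitude_phase` is a hypothetical helper; it uses `math.atan2(B, A)` rather than $\arctan(B/A)$, an assumption made so that $A = C\cos\alpha$ and $B = C\sin\alpha$ hold for every sign of $A$ (the two agree when $A > 0$). The values of $A$, $B$, $k$ are arbitrary.

```python
import math

def amplitude_phase(A, B):
    """Return (C, alpha) with A = C cos(alpha) and B = C sin(alpha)."""
    C = math.hypot(A, B)
    alpha = math.atan2(B, A)  # equals arctan(B/A) when A > 0
    return C, alpha

A, B, k = 3.0, 4.0, 2.0
C, alpha = amplitude_phase(A, B)

# Largest difference between the two forms of the solution at a few sample times.
gap = max(
    abs(A * math.sin(k * x) + B * math.cos(k * x) - C * math.sin(k * x + alpha))
    for x in (0.0, 0.5, 1.7, 3.1)
)
print(C, alpha, gap)
```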

example 2. Damped vibrations. If a particle undergoing simple harmonic motion is
suddenly subjected to an external force proportional to its velocity, the new motion satisfies
a differential equation of the form

$$y'' + 2cy' + k^2 y = 0 ,$$

where $c$ and $k^2$ are constants, $c \ne 0$, $k > 0$. If $c > 0$, we will show that all solutions tend
to zero as $x \to +\infty$. In this case, the differential equation is said to be stable. The external
force causes damping of the motion. If $c < 0$, we will show that some solutions have
arbitrarily large absolute values as $x \to +\infty$. In this case, the equation is said to be
unstable.

Since the discriminant of the equation is $d = (2c)^2 - 4k^2 = 4(c^2 - k^2)$, the nature of
the solutions is determined by the relative sizes of $c^2$ and $k^2$. The three cases $d = 0$, $d > 0$,
and $d < 0$ may be analyzed as follows:

(a) Zero discriminant: $c^2 = k^2$. In this case, all solutions have the form

$$y = e^{-cx} (A + Bx) .$$

If $c > 0$, all solutions tend to 0 as $x \to +\infty$. This case is referred to as critical damping.
If $B \ne 0$, each solution will change sign exactly once because of the linear factor $A + Bx$.
An example is shown in Figure 8.6(a). If $c < 0$, each nontrivial solution tends to $+\infty$ or
to $-\infty$ as $x \to +\infty$.

(b) Positive discriminant: $c^2 > k^2$. By Theorem 8.7 all solutions have the form

$$y = e^{-cx} (A e^{hx} + B e^{-hx}) = A e^{(h-c)x} + B e^{-(h+c)x} ,$$

where $h = \tfrac{1}{2} \sqrt{d} = \sqrt{c^2 - k^2}$. Since $h^2 = c^2 - k^2$, we have $h^2 - c^2 < 0$ so $(h - c)(h + c) < 0$.
Therefore, the numbers $h - c$ and $h + c$ have opposite signs. If $c > 0$, then $h + c$ is
positive so $h - c$ is negative, and hence both exponentials $e^{(h-c)x}$ and $e^{-(h+c)x}$ tend to zero
as $x \to +\infty$. In this case, referred to as overcritical damping, all solutions tend to 0 for
large $x$. An example is shown in Figure 8.6(a). Each solution can change sign at most
once.

If $c < 0$, then $h - c$ is positive but $h + c$ is negative. Thus, both exponentials $e^{(h-c)x}$
and $e^{-(h+c)x}$ tend to $+\infty$ for large $x$, so again there are solutions with arbitrarily large
absolute values.

(c) Negative discriminant: $c^2 < k^2$. In this case, all solutions have the form

$$y = C e^{-cx} \sin (hx + \alpha) ,$$

where $h = \tfrac{1}{2} \sqrt{-d} = \sqrt{k^2 - c^2}$. If $c > 0$, every nontrivial solution oscillates, but the
amplitude of the oscillation decreases to 0 as $x \to +\infty$. This case is called undercritical
damping and is illustrated in Figure 8.6(b). If $c < 0$, all nontrivial solutions take arbitrarily
large positive and negative values as $x \to +\infty$.
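
The three cases (a), (b), (c) amount to a test on the sign of the discriminant. A minimal sketch follows, restricted to the stable case $c > 0$ where the damping terminology applies; the function name `classify_damping` and the returned labels are illustrative.

```python
def classify_damping(c, k):
    """Classify y'' + 2c y' + k^2 y = 0 for c > 0 by the sign of d = 4(c^2 - k^2)."""
    if c <= 0:
        raise ValueError("the damping terminology here assumes c > 0")
    d = 4 * (c * c - k * k)
    if d == 0:
        return "critical"
    return "overcritical" if d > 0 else "undercritical"

# With k = 2, the values c = 1, 2, 3 fall into the three different cases.
labels = [classify_damping(c, 2) for c in (1, 2, 3)]
print(labels)
```

With floating-point inputs the exact test `d == 0` is fragile; a tolerance would be a reasonable substitute in practice.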



Figure 8.6 Damped vibrations occurring as solutions of $y'' + 2cy' + k^2 y = 0$, with
$c > 0$, and discriminant $4(c^2 - k^2)$.


example 3. Electric circuits. If we insert a capacitor in the electric circuit of Example 5
in Section 8.6, the differential equation which serves as a model for this circuit is given by

$$L I'(t) + R I(t) + \frac{1}{C} \int I(t) \, dt = V(t) ,$$

where $C$ is a positive constant called the capacitance. Differentiation of this equation gives
a second-order linear equation of the form

$$L I''(t) + R I'(t) + \frac{1}{C} I(t) = V'(t) .$$

If the impressed voltage $V(t)$ is constant, the right member is zero and the equation takes
the form

$$I''(t) + \frac{R}{L} I'(t) + \frac{1}{LC} I(t) = 0 .$$

This is the same type of equation analyzed in Example 2 except that $2c$ is replaced by $R/L$,
and $k^2$ is replaced by $1/(LC)$. In this case, the coefficient $c$ is positive so the equation is
always stable. In other words, the current $I(t)$ always tends to 0 as $t \to +\infty$. The
terminology of Example 2 is also used here. The current is said to be critically damped
when the discriminant is zero ($CR^2 = 4L$), overcritically damped when the discriminant
is positive ($CR^2 > 4L$), and undercritically damped when the discriminant is negative
($CR^2 < 4L$).

example 4. Motion of a rocket with variable mass. A rocket is propelled by burning
fuel in a combustion chamber, allowing the products of combustion to be expelled backward.
Assume the rocket starts from rest and moves vertically upward along a straight line.
Designate the altitude of the rocket at time $t$ by $r(t)$, the mass of the rocket (including fuel)
by $m(t)$, and the velocity of the exhaust matter, relative to the rocket, by $c(t)$. In the absence
of external forces, the equation

(8.35) $m(t) r''(t) = m'(t) c(t)$

is used as a mathematical model for discussing the motion. The left member, $m(t) r''(t)$, is
the product of the mass of the rocket and its acceleration. The right member, $m'(t) c(t)$, is
the accelerating force on the rocket caused by the thrust developed by the rocket engine.
In the examples to be considered here, $m(t)$ and $c(t)$ are known or can be prescribed in
terms of $r(t)$ or its derivative $r'(t)$ (the velocity of the rocket). Equation (8.35) then becomes
a second-order differential equation for the position function $r$.

If external forces are also present, such as gravitational attraction, then, instead of
(8.35), we use the equation

(8.36) $m(t) r''(t) = m'(t) c(t) + F(t) ,$

where $F(t)$ represents the sum of all external forces acting on the rocket at time $t$.

Before we consider a specific example, we will give an argument which may serve to
motivate Equation (8.35). For this purpose we consider first a rocket that fires its
exhaust matter intermittently, like bullets from a gun. Specifically, we consider a time
interval $[t, t + h]$, where $h$ is a small positive number; we assume that some exhaust
matter is expelled at time $t$, and that no further exhaust matter is expelled in the half-open
interval $(t, t + h]$. On the basis of this assumption, we obtain a formula whose limit, as
$h \to 0$, is Equation (8.35).

Just before the exhaust material is expelled at time $t$, the rocket has mass $m(t)$ and
velocity $v(t)$. At the end of the time interval $[t, t + h]$, the rocket has mass $m(t + h)$ and
velocity $v(t + h)$. The mass of the expelled matter is $m(t) - m(t + h)$, and its velocity
during the interval is $v(t) + c(t)$, since $c(t)$ is the velocity of the exhaust relative to the
rocket. Just before the exhaust material is expelled at time $t$, the rocket is a system with
momentum $m(t) v(t)$. At time $t + h$, this system consists of two parts, a rocket with
momentum $m(t + h) v(t + h)$ and exhaust matter with momentum $[m(t) - m(t + h)][v(t) + c(t)]$.
The law of conservation of momentum states that the momentum of the new system
must be equal to that of the old. Therefore, we have

$$m(t) v(t) = m(t + h) v(t + h) + [m(t) - m(t + h)][v(t) + c(t)] ,$$

from which we obtain

$$m(t + h)[v(t + h) - v(t)] = [m(t + h) - m(t)] c(t) .$$

Dividing by $h$ and letting $h \to 0$, we find that

$$m(t) v'(t) = m'(t) c(t) ,$$

which is equivalent to Equation (8.35).

Consider a special case in which the rocket starts from rest with an initial weight of
$w$ pounds (including $b$ pounds of fuel) and moves vertically upward along a straight line.
Assume the fuel is consumed at a constant rate of $k$ pounds per second and that the products
of combustion are discharged directly backward with a constant speed of $c$ feet per second
relative to the rocket. Assume the only external force acting on the rocket is the earth's
gravitational attraction. We want to know how high the rocket will travel before all its
fuel is consumed.

Since all the fuel is consumed when $kt = b$, we restrict $t$ to the interval $0 \le t \le b/k$.
The only external force acting on the rocket is $-m(t) g$, and the exhaust velocity is $c(t) = -c$, so Equation
(8.36) becomes

$$m(t) r''(t) = -m'(t) c - m(t) g .$$

The weight of the rocket at time $t$ is $w - kt$, and its mass $m(t)$ is $(w - kt)/g$; hence we have
$m'(t) = -k/g$ and the foregoing equation becomes

$$r''(t) = -\frac{m'(t)}{m(t)} c - g = \frac{kc}{w - kt} - g .$$

Integrating, and using the initial condition $r'(0) = 0$, we find

$$r'(t) = -c \log \frac{w - kt}{w} - gt .$$

Integrating again and using the initial condition $r(0) = 0$, we obtain the relation

$$r(t) = \frac{c (w - kt)}{k} \log \frac{w - kt}{w} - \frac{1}{2} g t^2 + ct .$$
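
The closed form for $r(t)$ can be compared with a direct numerical integration of $r''(t) = kc/(w - kt) - g$ under $r(0) = r'(0) = 0$. The sketch below is only a check under stated assumptions: the value $g = 32$ feet per second per second, the sample values of $w$, $k$, $c$, and the function names are all illustrative and not taken from the text.

```python
import math

def altitude(t, w, k, c, g=32.0):
    """Closed-form r(t) from the text, valid while w - k*t > 0."""
    ratio = (w - k * t) / w
    return c * (w - k * t) / k * math.log(ratio) - 0.5 * g * t * t + c * t

def altitude_numeric(t, w, k, c, g=32.0, steps=20000):
    """Integrate r'' = k*c/(w - k*t) - g with r(0) = r'(0) = 0 using midpoint steps."""
    h = t / steps
    r, v = 0.0, 0.0
    for i in range(steps):
        tm = (i + 0.5) * h                 # midpoint of the current step
        acc = k * c / (w - k * tm) - g     # acceleration sampled at the midpoint
        r += v * h + 0.5 * acc * h * h
        v += acc * h
    return r

# Hypothetical data: 1000 lb rocket, burning 10 lb/s, exhaust speed 5000 ft/s, for 50 s.
w, k, c, t_end = 1000.0, 10.0, 5000.0, 50.0
closed = altitude(t_end, w, k, c)
numeric = altitude_numeric(t_end, w, k, c)
print(closed, numeric)
```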

All the fuel is consumed when $t = b/k$. At that instant the altitude is

(8.37) $r\left(\frac{b}{k}\right) = \frac{c (w - b)}{k} \log \frac{w - b}{w} - \frac{1}{2} \frac{g b^2}{k^2} + \frac{cb}{k} .$

This formula is valid if $b < w$. For some rockets, the weight of the carrier is negligible
compared to the weight of the fuel, and it is of interest to consider the limiting case $b = w$.
We cannot put $b = w$ in (8.37) because of the presence of the term $\log (w - b)/w$. However,
if we let $b \to w$, the first term in (8.37) is an indeterminate form with limit 0. Therefore,
when $b \to w$, the limiting value of the right member of (8.37) is

$$\lim_{b \to w} r\left(\frac{b}{k}\right) = -\frac{1}{2} \frac{g w^2}{k^2} + \frac{cw}{k} = cT - \frac{1}{2} g T^2 ,$$

where $T = w/k$ is the time required for the entire weight $w$ to be consumed.


8.19 Exercises

In Exercises 1 through 5, a particle is assumed to be moving in simple harmonic motion,
according to the equation $y = C \sin (kx + \alpha)$. The velocity of the particle is defined to be the derivative
$y'$. The frequency of the motion is the reciprocal of the period. (Period = $2\pi/k$; frequency = $k/2\pi$.)
The frequency represents the number of cycles completed in unit time, provided $k > 0$.

1. Find the amplitude $C$ if the frequency is $1/\pi$ and if the initial values of $y$ and $y'$ (when $x = 0$)
are 2 and 4, respectively.

2. Find the velocity when $y$ is zero, given that the amplitude is 7 and the frequency is 10.

3. Show that the equation of motion can also be written as follows:

$$y = A \cos (mx + \beta) .$$

Find equations that relate the constants $A$, $m$, $\beta$, and $C$, $k$, $\alpha$.

4. Find the equation of motion given that y = 3 and y’ = 0 when x = 0 and that the period is 

5. Find the amplitude of the motion if the period is $2\pi$ and the velocity is $\pm v_0$ when $y = y_0$.

6. A particle undergoes simple harmonic motion. Initially its displacement is 1, its velocity is 2
and its acceleration is $-12$. Compute its displacement and acceleration when the velocity is $\sqrt{8}$.

7. For a certain positive number $k$, the differential equation of simple harmonic motion
$y'' + k^2 y = 0$ has solutions of the form $y = f(x)$ with $f(0) = f(3) = 0$ and $f(x) < 0$ for all $x$ in
the open interval $0 < x < 3$. Compute $k$ and find all solutions.

8. The current $I(t)$ at time $t$ flowing in an electric circuit obeys the differential equation
$I''(t) + I(t) = G(t)$, where $G$ is a step function given by $G(t) = 1$ if $0 < t < 2\pi$, $G(t) = 0$ for all other $t$.
Determine the solution which satisfies the initial conditions $I(0) = 0$, $I'(0) = 1$.

9. The current $I(t)$ at time $t$ flowing in an electric circuit obeys the differential equation

$$I''(t) + R I'(t) + I(t) = \sin \omega t ,$$

where $R$ and $\omega$ are positive constants. The solution can be expressed in the form
$I(t) = F(t) + A \sin (\omega t + \alpha)$, where $F(t) \to 0$ as $t \to +\infty$, and $A$ and $\alpha$ are constants depending on
$R$ and $\omega$, with $A > 0$. If there is a value of $\omega$ which makes $A$ as large as possible, then $\omega/(2\pi)$
is called a resonance frequency of the circuit.

(a) Find all resonance frequencies when $R = 1$.

(b) Find those values of $R$ for which the circuit will have a resonance frequency.

10. A spaceship is returning to earth. Assume that the only external force acting on it is the
action of gravity, and that it falls along a straight line toward the center of the earth. The
effect of gravity is partly overcome by firing a rocket directly downward. The rocket fuel is
consumed at a constant rate of $k$ pounds per second and the exhaust material has a constant
speed of $c$ feet per second relative to the rocket. Find a formula for the distance the spaceship
falls in time $t$ if it starts from rest at time $t = 0$ with an initial weight of $w$ pounds.

11. A rocket of initial weight $w$ pounds starts from rest in free space (no external forces) and
moves along a straight line. The fuel is consumed at a constant rate of $k$ pounds per second
and the products of combustion are discharged directly backward at a constant speed of $c$
feet per second relative to the rocket. Find the distance traveled at time $t$.

12. Solve Exercise 11 if the initial speed of the rocket is $v_0$ and if the products of combustion are
fired at such a speed that the discharged material remains at rest in space.


8.20 Remarks concerning nonlinear differential equations 

Since second-order linear differential equations with constant coefficients occur in such
a wide variety of scientific problems, it is indeed fortunate that we have systematic methods
for solving these equations. Many nonlinear equations also arise naturally from both
physical and geometrical problems, but there is no comprehensive theory comparable to
that for linear equations. In the introduction to this chapter we mentioned a classic "bag
of tricks" that has been developed for treating many special cases of nonlinear equations.
We conclude this chapter with a discussion of some of these tricks and some of the problems
they help to solve. We shall consider only first-order equations which can be solved for
the derivative $y'$ and expressed in the form

(8.38) $y' = f(x, y) .$

We recall that a solution of (8.38) on an interval $I$ is any function, say $y = Y(x)$, which
is differentiable on $I$ and satisfies the relation $Y'(x) = f[x, Y(x)]$ for all $x$ in $I$. In the linear
case, we proved an existence-uniqueness theorem which tells us that one and only one
solution exists satisfying a prescribed initial condition. Moreover, we have an explicit
formula for determining this solution.

This is not typical of the general case. A nonlinear equation may have no solution
satisfying a given initial condition, or it may have more than one. For example, the equation
$(y')^2 - xy' + y + 1 = 0$ has no solution with $y = 0$ when $x = 0$, since this would require
that $(y')^2 = -1$ when $x = 0$. On the other hand, the equation $y' = 3y^{2/3}$ has two distinct
solutions, $Y_1(x) = 0$ and $Y_2(x) = x^3$, satisfying the initial condition $y = 0$ when $x = 0$.
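
The nonuniqueness claim can be confirmed by substituting both functions into $y' = 3y^{2/3}$. In the sketch below the right-hand side is written as $3|y|^{2/3}$, an assumption made so that the real cube root is taken for negative $y$; the derivatives $Y_1' = 0$ and $Y_2' = 3x^2$ are entered by hand, and the sample points are arbitrary.

```python
def f(y):
    """Right-hand side of y' = 3 y^(2/3), using the real cube root for either sign of y."""
    return 3.0 * abs(y) ** (2.0 / 3.0)

samples = (-2.0, -0.5, 0.0, 1.0, 2.5)

# Y1(x) = 0 has derivative 0; Y2(x) = x^3 has derivative 3x^2.
gap1 = max(abs(0.0 - f(0.0)) for x in samples)
gap2 = max(abs(3.0 * x * x - f(x ** 3)) for x in samples)
print(gap1, gap2)
```

Both gaps vanish up to rounding, and both functions are 0 at $x = 0$, so the initial condition does not single out one solution.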

Thus, the study of nonlinear equations is more difficult because of the possible
nonexistence or nonuniqueness of solutions. Also, even when solutions exist, it may not be
possible to determine them explicitly in terms of familiar functions. Sometimes we can
eliminate the derivative $y'$ from the differential equation and arrive at a relation of the form

$$F(x, y) = 0$$

satisfied by some, or perhaps all, solutions. If this equation can be solved for $y$ in terms
of $x$, we get an explicit formula for the solution. More often than not, however, the
equation is too complicated to solve for $y$. For example, in a later section we shall study
the differential equation

$$y' = \frac{y - x}{y + x} ,$$

and we shall find that every solution necessarily satisfies the relation

(8.39) $\frac{1}{2} \log (x^2 + y^2) + \arctan \frac{y}{x} + C = 0$

for some constant $C$. It would be hopeless to try to solve this equation for $y$ in terms of $x$.
In a case like this, we say that the relation (8.39) is an implicit formula for the solutions. It
is common practice to say that the differential equation has been "solved" or "integrated"
when we arrive at an implicit formula such as $F(x, y) = 0$ in which no derivatives of the
unknown function appear. Sometimes this formula reveals useful information about the
solutions. On the other hand, the reader should realize that such an implicit relation may
be less helpful than the differential equation itself for studying properties of the solutions.
In the next section we show how qualitative information about the solutions can often
be obtained directly from the differential equation without a knowledge of explicit or
implicit formulas for the solutions.


8.21 Integral curves and direction fields 

Consider a differential equation of first order, say $y' = f(x, y)$, and suppose some of the
solutions satisfy an implicit relation of the form

(8.40) $F(x, y, C) = 0 ,$

where $C$ denotes a constant. If we introduce a rectangular coordinate system and plot all
the points $(x, y)$ whose coordinates satisfy (8.40) for a particular $C$, we obtain a curve called
an integral curve of the differential equation. Different values of $C$ usually give different
integral curves, but all of them share a common geometric property. The differential
equation $y' = f(x, y)$ relates the slope $y'$ at each point $(x, y)$ of the curve to the coordinates
$x$ and $y$. As $C$ takes on all its values, the collection of integral curves obtained is called a
one-parameter family of curves.

For example, when the differential equation is $y' = 3$, integration gives us $y = 3x + C$,
and the integral curves form a family of straight lines, all having slope 3. The arbitrary
constant $C$ represents the y-intercept of these lines.

If the differential equation is $y' = x$, integration yields $y = \tfrac{1}{2} x^2 + C$, and the integral
curves form a family of parabolas as shown in Figure 8.7. Again, the constant $C$ tells us
where the various curves cross the y-axis. Figure 8.8 illustrates the family of exponential
curves, $y = Ce^x$, which are integral curves of the differential equation $y' = y$. Once more,
$C$ represents the y-intercept. In this case, $C$ is also equal to the slope of the curve at the
point where it crosses the y-axis.

A family of nonparallel straight lines is shown in Figure 8.9. These are integral curves
of the differential equation

(8.41) $y = x \frac{dy}{dx} - \frac{1}{4} \left( \frac{dy}{dx} \right)^2 ,$

and a one-parameter family of solutions is given by

(8.42) $y = Cx - \tfrac{1}{4} C^2 .$

This family is one which possesses an envelope, that is, a curve having the property that
at each of its points it is tangent to one of the members of the family.† The envelope here
is $y = x^2$ and its graph is indicated by the dotted curve in Figure 8.9. The envelope of a
family of integral curves is itself an integral curve because the slope and coordinates at a
point of the envelope are the same as those of one of the integral curves of the family. In
this example, it is easy to verify directly that $y = x^2$ is a solution of (8.41). Note that this
particular solution is not a member of the family in (8.42). Further solutions, not members
of the family, may be obtained by piecing together members of the family with portions
of the envelope. An example is shown in Figure 8.10. The tangent line at $A$ comes from
taking $C = -2$ in (8.42) and the tangent at $B$ comes from $C = \tfrac{1}{2}$. The resulting solution,
$y = f(x)$, is given as follows:

$$f(x) = \begin{cases} -2x - 1 & \text{if } x \le -1 , \\ x^2 & \text{if } -1 \le x \le \tfrac{1}{4} , \\ \tfrac{1}{2} x - \tfrac{1}{16} & \text{if } x \ge \tfrac{1}{4} . \end{cases}$$

Figure 8.9 Integral curves of the differential equation $y = x \frac{dy}{dx} - \frac{1}{4} \left( \frac{dy}{dx} \right)^2$.

Figure 8.10 A solution of Equation (8.41) that is not a member of the family in Equation (8.42).

† And conversely, each member of the family is tangent to the envelope.

This function has a derivative and satisfies the differential equation in (8.41) for every
real $x$. It is clear that an infinite number of similar examples could be constructed in the
same way. This example shows that it may not be easy to exhibit all possible solutions of
a differential equation.
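
A pieced-together solution of this kind can be checked exactly with rational arithmetic. The sketch below assumes the reading of the example used here: the line with $C = -2$ for $x \le -1$, the envelope $y = x^2$ for $-1 \le x \le \tfrac{1}{4}$, and the line with $C = \tfrac{1}{2}$ for $x \ge \tfrac{1}{4}$, substituted into the equation $y = x y' - \tfrac{1}{4}(y')^2$. The function names are illustrative.

```python
from fractions import Fraction

def f(x):
    """Piecewise solution: tangent line (C = -2), envelope x^2, tangent line (C = 1/2)."""
    if x <= -1:
        return -2 * x - 1
    if x <= Fraction(1, 4):
        return x * x
    return Fraction(1, 2) * x - Fraction(1, 16)

def f_prime(x):
    """Derivative of f; the one-sided slopes agree at x = -1 and x = 1/4."""
    if x <= -1:
        return Fraction(-2)
    if x <= Fraction(1, 4):
        return 2 * x
    return Fraction(1, 2)

def satisfies_equation(x):
    """Check y = x*y' - (y')^2 / 4 at the point x."""
    p = f_prime(x)
    return f(x) == x * p - p * p / 4

# Test on a grid of rationals covering all three pieces and both junctions.
all_ok = all(satisfies_equation(Fraction(n, 8)) for n in range(-24, 25))
print(all_ok)
```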

Sometimes it is possible to find a first-order differential equation satisfied by all members 
of a one-parameter family of curves. We illustrate with two examples. 

example 1. Find a first-order differential equation satisfied by all circles with center
at the origin.

Solution. A circle with center at the origin and radius $C$ satisfies the equation
$x^2 + y^2 = C^2$. As $C$ varies over all positive numbers, we obtain every circle with center
at the origin. To find a first-order differential equation having these circles as integral
curves, we simply differentiate the Cartesian equation to obtain $2x + 2yy' = 0$. Thus,
each circle satisfies the differential equation $y' = -x/y$.

example 2. Find a first-order differential equation for the family of all circles passing
through the origin and having their centers on the x-axis.

Solution. If the center of a circle is at $(C, 0)$ and if it passes through the origin, the
theorem of Pythagoras tells us that each point $(x, y)$ on the circle satisfies the Cartesian
equation $(x - C)^2 + y^2 = C^2$, which can be written as

(8.43) $x^2 + y^2 - 2Cx = 0 .$

To find a differential equation having these circles as integral curves, we differentiate (8.43)
to obtain $2x + 2yy' - 2C = 0$, or

(8.44) $x + yy' = C .$

Since this equation contains $C$, it is satisfied only by that circle in (8.43) corresponding to
the same $C$. To obtain one differential equation satisfied by all the curves in (8.43), we
must eliminate $C$. We could differentiate (8.44) to obtain $1 + yy'' + (y')^2 = 0$. This is a
second-order differential equation satisfied by all the curves in (8.43). We can obtain a
first-order equation by eliminating $C$ algebraically from (8.43) and (8.44). Substituting
$x + yy'$ for $C$ in (8.43), we obtain $x^2 + y^2 = 2x(x + yy')$, a first-order equation which
can be solved for $y'$ and written as $y' = (y^2 - x^2)/(2xy)$.
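
As a check on Example 2, the slope prescribed by $y' = (y^2 - x^2)/(2xy)$ can be compared with the slope $(C - x)/y$ that (8.44) gives on each circle. The sketch below parametrizes points of the circle with center $(C, 0)$ by an angle; the function names, the chosen values of $C$, and the angles are illustrative, and points with $x = 0$ or $y = 0$ are avoided.

```python
import math

def slope_from_equation(x, y):
    """Slope given by the first-order equation y' = (y^2 - x^2) / (2xy)."""
    return (y * y - x * x) / (2.0 * x * y)

def slope_on_circle(C, x, y):
    """Slope from (8.44), x + y y' = C, on the circle with center (C, 0)."""
    return (C - x) / y

worst = 0.0
for C in (-2.0, 0.5, 3.0):
    for theta in (0.3, 1.0, 2.0, 4.0, 5.5):
        # A point on the circle (x - C)^2 + y^2 = C^2.
        x = C + C * math.cos(theta)
        y = C * math.sin(theta)
        worst = max(worst, abs(slope_from_equation(x, y) - slope_on_circle(C, x, y)))
print(worst)
```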

Figure 8.11 illustrates what is called a direction field of a differential equation. This is 
simply a collection of short line segments drawn tangent to the various integral curves. 
The particular example shown in Figure 8.11 is a direction field of the equation y’ = y. 

A direction field can be constructed without solving the differential equation. Choose
a point, say $(a, b)$, and compute the number $f(a, b)$ obtained by substituting in the right-hand
side of the differential equation $y' = f(x, y)$. If there is an integral curve through this point,
its slope there must be equal to $f(a, b)$. Therefore, if we draw a short line segment through
$(a, b)$ having this slope, it will be part of a direction field of the differential equation. By
drawing several of these line segments, we can get a fair idea of the general behavior of the
integral curves. Sometimes such qualitative information about the solution may be all
that is needed. Notice that different points $(0, b)$ on the y-axis yield different integral
curves. This gives us a geometric reason for expecting an arbitrary constant to appear
when we integrate a first-order equation.

Figure 8.11 A direction field for the differential equation $y' = y$.
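The construction just described is easy to mechanize. The following is a minimal sketch, not part of the original text: the function name direction_field, the grid, and the segment length 0.4 are illustrative assumptions. It computes the endpoints of a short segment of slope $f(a, b)$ centered at each grid point $(a, b)$, here for the equation $y' = y$.

```python
import math

def direction_field(f, xs, ys, length=0.4):
    """Return short segments ((x0, y0), (x1, y1)) centered at each grid
    point (a, b), each having slope f(a, b)."""
    segments = []
    for a in xs:
        for b in ys:
            slope = f(a, b)
            # Scale the direction (1, slope) so the segment has the given length.
            norm = math.hypot(1.0, slope)
            dx = 0.5 * length / norm
            dy = 0.5 * length * slope / norm
            segments.append(((a - dx, b - dy), (a + dx, b + dy)))
    return segments

# The equation y' = y on a small (hypothetical) grid.
grid = [-1.0, 0.0, 1.0]
for (x0, y0), (x1, y1) in direction_field(lambda x, y: y, grid, grid):
    print(f"({x0:+.2f}, {y0:+.2f}) -> ({x1:+.2f}, {y1:+.2f})")
```

The segments could be passed to any plotting routine; only the slopes $f(a, b)$ are needed, never a solution of the differential equation.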


8.22 Exercises 


In Exercises 1 through 12, find a first-order differential equation having the given family of 
curves as integral curves. 

1. $2x + 3y = C$.
2. $y = Ce^{-2x}$.
3. $x^2 - y^2 = C$.
4. $xy = C$.
5. $y^2 = Cx$.
6. $x^2 + y^2 + 2Cy = 1$.
7. $y = C(x - 1)e^x$.
8. $y^4(x + 2) = C(x - 2)$.
9. $y = C \cos x$.
10. $\arctan y + \arcsin x = C$.
11. All circles through the points $(1, 0)$ and $(-1, 0)$.
12. All circles through the points $(1, 1)$ and $(-1, -1)$.


In the construction of a direction field of a differential equation, sometimes the work may be
speeded considerably if we first locate those points at which the slope $y'$ has a constant value $C$.
For each $C$, these points lie on a curve called an isocline.

13. Plot the isoclines corresponding to the constant slopes 1 and 2 for the differential equation
$y' = x^2 + y^2$. With the aid of the isoclines, construct a direction field for the equation and try
to determine the shape of the integral curve passing through the origin.

14. Show that the isoclines of the differential equation $y' = x + y$ form a one-parameter family
of straight lines. Plot the isoclines corresponding to the constant slopes $0, \pm\tfrac{1}{2}, \pm 1, \pm\tfrac{3}{2}, \pm 2$.
With the aid of the isoclines, construct a direction field and sketch the integral curve passing
through the origin. One of the integral curves is also an isocline; find this curve.

15. Plot a number of isoclines and construct a direction field for the equation



If you draw the direction field carefully, you should be able to determine a one-parameter 
family of solutions of this equation from the appearance of the direction field. 


8.23 First-order separable equations 

A first-order differential equation of the form $y' = f(x, y)$ in which the right member
$f(x, y)$ splits into a product of two factors, one depending on $x$ alone and the other depending
on $y$ alone, is said to be a separable equation. Examples are $y' = x^3$, $y' = y$, $y' = \sin y \log x$,
$y' = x/\tan y$, etc. Thus each separable equation can be expressed in the form

$y' = Q(x)R(y)$,

where $Q$ and $R$ are given functions. When $R(y) \neq 0$, we can divide by $R(y)$ and rewrite
this differential equation in the form

$A(y)y' = Q(x)$,

where $A(y) = 1/R(y)$. The next theorem tells us how to find an implicit formula satisfied
by every solution of such an equation.

THEOREM 8.10. Let $y = Y(x)$ be any solution of the separable differential equation

(8.45)    $A(y)y' = Q(x)$

such that $Y'$ is continuous on an open interval $I$. Assume that both $Q$ and the composite
function $A \circ Y$ are continuous on $I$. Let $G$ be any primitive of $A$, that is, any function such
that $G' = A$. Then the solution $Y$ satisfies the implicit formula

(8.46)    $G(y) = \int Q(x)\,dx + C$

for some constant $C$. Conversely, if $y$ satisfies (8.46) then $y$ is a solution of (8.45).

Proof. Since $Y$ is a solution of (8.45), we must have

(8.47)    $A[Y(x)]Y'(x) = Q(x)$

for each $x$ in $I$. Since $G' = A$, this equation becomes

$G'[Y(x)]Y'(x) = Q(x)$.

But, by the chain rule, the left member is the derivative of the composite function $G \circ Y$.






Therefore $G \circ Y$ is a primitive of $Q$, which means that

(8.48)    $G[Y(x)] = \int Q(x)\,dx + C$

for some constant $C$. This is the relation (8.46). Conversely, if $y = Y(x)$ satisfies (8.46),
differentiation gives us (8.47), which shows that $Y$ is a solution of the differential equation
(8.45).

Note: The implicit formula (8.46) can also be expressed in terms of $A$. From (8.47)
we have

$\int A[Y(x)]Y'(x)\,dx = \int Q(x)\,dx + C$.

If we make the substitution $y = Y(x)$, $dy = Y'(x)\,dx$ in the integral on the left, the
equation becomes

(8.49)    $\int A(y)\,dy = \int Q(x)\,dx + C$.

Since the indefinite integral $\int A(y)\,dy$ represents any primitive of $A$, Equation (8.49) is
an alternative way of writing (8.46).

In practice, formula (8.49) is obtained directly from (8.45) by a mechanical process. In
the differential equation (8.45) we write $dy/dx$ for the derivative $y'$ and then treat $dy/dx$ as
a fraction to obtain the relation $A(y)\,dy = Q(x)\,dx$. Now we simply attach integral signs
to both sides of this equation and add the constant C to obtain (8.49). The justification for 
this mechanical process is provided by Theorem 8.10. This process is another example 
illustrating the effectiveness of the Leibniz notation. 


EXAMPLE. The nonlinear equation $xy' + y = y^2$ is separable since it can be written in
the form

(8.50)    $\dfrac{y'}{y(y - 1)} = \dfrac{1}{x}$,

provided that $y(y - 1) \neq 0$ and $x \neq 0$. Now the two constant functions $y = 0$ and $y = 1$
are clearly solutions of $xy' + y = y^2$. The remaining solutions, if any exist, satisfy (8.50)
and, hence, by Theorem 8.10 they also satisfy

$\int \dfrac{dy}{y(y - 1)} = \int \dfrac{dx}{x} + K$

for some constant $K$. Since the integrand on the left is $1/(y - 1) - 1/y$, when we integrate,
we find that

$\log |y - 1| - \log |y| = \log |x| + K$.

This gives us $|(y - 1)/y| = |x| e^K$ or $(y - 1)/y = Cx$ for some constant $C$. Solving for $y$,
we obtain the explicit formula

(8.51)    $y = \dfrac{1}{1 - Cx}$.

Theorem 8.10 tells us that for any choice of $C$ this $y$ is a solution; therefore, in this example
we have determined all solutions: the constant functions $y = 0$ and $y = 1$ and all the
functions defined by (8.51). Note that the choice $C = 0$ gives the constant solution $y = 1$.
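A numerical spot check can make the conclusion of this example more concrete. The sketch below is an illustration only: the sample values of $C$ and $x$, the step $h$, and the helper name residual are assumptions, not part of the text. It substitutes formula (8.51) into $xy' + y - y^2$, approximating $y'$ by a central difference, and the printed residuals should be close to zero.

```python
def y(x, C):
    """Explicit solution (8.51): y = 1 / (1 - C x)."""
    return 1.0 / (1.0 - C * x)

def residual(x, C, h=1e-6):
    """x*y' + y - y^2, with y' approximated by a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2.0 * h)
    return x * dy + y(x, C) - y(x, C) ** 2

for C in (-2.0, 0.0, 0.5, 3.0):
    for x in (-1.5, 0.25, 1.0):
        if abs(1.0 - C * x) > 1e-3:  # stay away from the pole x = 1/C
            print(C, x, residual(x, C))
```

The check is only approximate, since the derivative is replaced by a difference quotient; the exact verification is the algebra carried out in the example.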


8.24 Exercises 


In Exercises 1 through 12, assume solutions exist and find an implicit formula satisfied by the
solutions.

1. $y' = x^3/y^2$.
2. $\tan x \cos y = -y' \tan y$.
3. $(x + 1)y' + y^2 = 0$.
4. $y' = (y - 1)(y - 2)$.
5. $y\sqrt{1 - x^2}\,y' = x$.
6. $(x - 1)y' = xy$.
7. $(1 - x^2)^{1/2}y' + 1 + y^2 = 0$.
8. $xy(1 + x^2)y' - (1 + y^2) = 0$.
9. $(x^2 - 4)y' = y$.
10. $xyy' = 1 + x^2 + y^2 + x^2y^2$.
11. $yy' = e^{x+2y} \sin x$.
12. $x\,dx + y\,dy = xy(x\,dy - y\,dx)$.


In Exercises 13 through 16, find functions $f$, continuous on the whole real axis, which satisfy the
conditions given. When it is easy to enumerate all of them, do so; in any case, find as many as
you can.

13. $f(x) = 2 + \int_1^x f(t)\,dt$.

14. $f(x)f'(x) = 5x$, $f(0) = 1$.

15. $f'(x) + 2xe^{f(x)} = 0$, $f(0) = 0$.

16. $f^2(x) + [f'(x)]^2 = 1$. Note: $f(x) = -1$ is one solution.

17. A nonnegative function $f$, continuous on the whole real axis, has the property that its ordinate
set over an arbitrary interval has an area proportional to the length of the interval. Find $f$.

18. Solve Exercise 17 if the area is proportional to the difference of the function values at the
endpoints of the interval.

19. Solve Exercise 18 when “difference” is replaced by “sum.”

20. Solve Exercise 18 when “difference” is replaced by “product.”


8.25 Homogeneous first-order equations 

We consider now a special kind of first-order equation,

(8.52)    $y' = f(x, y)$,

in which the right-hand side has a special property known as homogeneity. This means that

(8.53)    $f(tx, ty) = f(x, y)$

for all $x$, $y$, and all $t \neq 0$. In other words, replacement of $x$ by $tx$ and $y$ by $ty$ has no effect
on the value of $f(x, y)$. Equations of the form (8.52) which have this property are called
homogeneous (sometimes called homogeneous of degree zero). Examples are the following:

$y' = \dfrac{y - x}{y + x}$,




* 2 + f 


y' = log x - log y . 


If we use (8.53) with $t = 1/x$, the differential equation in (8.52) becomes

(8.54)    $y' = f(1, y/x)$.

The appearance of the quotient $y/x$ on the right suggests that we introduce a new unknown
function $v$ where $v = y/x$. Then $y = vx$, $y' = v'x + v$, and this substitution transforms
(8.54) into

$v'x + v = f(1, v)$    or    $x\dfrac{dv}{dx} = f(1, v) - v$.

This last equation is a first-order separable equation for $v$. We may use Theorem 8.10 to
obtain an implicit formula for $v$ and then replace $v$ by $y/x$ to obtain an implicit formula
for $y$.

EXAMPLE. Solve the differential equation $y' = (y - x)/(y + x)$.

Solution. We rewrite the equation as follows:

$y' = \dfrac{y/x - 1}{y/x + 1}$.

The substitution $v = y/x$ transforms this into

$x\dfrac{dv}{dx} = \dfrac{v - 1}{v + 1} - v = -\dfrac{1 + v^2}{v + 1}$.

Applying Theorem 8.10, we get

$\int \dfrac{v}{1 + v^2}\,dv + \int \dfrac{1}{1 + v^2}\,dv = -\int \dfrac{dx}{x} + C$.

Integration yields

$\tfrac{1}{2} \log (1 + v^2) + \arctan v = -\log |x| + C$.

Replacing $v$ by $y/x$, we have

$\tfrac{1}{2} \log (x^2 + y^2) - \tfrac{1}{2} \log x^2 + \arctan \dfrac{y}{x} = -\log |x| + C$,

and since $\log x^2 = 2 \log |x|$, this simplifies to

$\tfrac{1}{2} \log (x^2 + y^2) + \arctan \dfrac{y}{x} = C$.
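One way to gain confidence in an implicit formula of this kind is to follow an integral curve numerically and watch the left member stay constant. The sketch below is an illustration under stated assumptions: the classical fourth-order Runge-Kutta method is our choice of integrator (the text does not use one), and the starting point $(1, 1)$, step size, and function names are hypothetical.

```python
import math

def f(x, y):
    """Right-hand side of y' = (y - x) / (y + x)."""
    return (y - x) / (y + x)

def invariant(x, y):
    """Left member of the implicit solution; constant along integral curves (x > 0)."""
    return 0.5 * math.log(x * x + y * y) + math.atan(y / x)

def rk4(x, y, h, steps):
    """Advance (x, y) along the integral curve with classical Runge-Kutta steps."""
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return x, y

x0, y0 = 1.0, 1.0
x1, y1 = rk4(x0, y0, 0.001, 300)
# The two values should agree to within the integration error.
print(invariant(x0, y0), invariant(x1, y1))
```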

There are some interesting geometric properties possessed by the solutions of a homogeneous
equation $y' = f(x, y)$. First of all, it is easy to show that straight lines through the
origin are isoclines of the equation. We recall that an isocline of $y' = f(x, y)$ is a curve
along which the slope $y'$ is constant. This property is illustrated in Figure 8.12 which
shows a direction field of the differential equation $y' = -2y/x$. The isocline corresponding
to slope $c$ has the equation $-2y/x = c$, or $y = -\tfrac{1}{2}cx$, and is therefore a line of slope $-\tfrac{1}{2}c$
through the origin. To prove the property in general, consider a line of slope $m$ through
the origin. Then $y = mx$ for all $(x, y)$ on this line; in particular, the point $(1, m)$ is on the
line. Suppose now, for the sake of simplicity, that there is an integral curve through each
point of the line $y = mx$. The slope of the integral curve through a point $(a, b)$ on this
line is $f(a, b) = f(a, ma)$. If $a \neq 0$, we may use the homogeneity property in (8.53) to
write $f(a, ma) = f(1, m)$. In other words, if $(a, b) \neq (0, 0)$, the integral curve through
$(a, b)$ has the same slope as the integral curve through $(1, m)$. Therefore the line $y = mx$
is an isocline, as asserted. (It can also be shown that these are the only isoclines of a
homogeneous equation.)

Figure 8.12 A direction field for the differential equation $y' = -2y/x$. The isoclines
are straight lines through the origin.
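The isocline property can be observed numerically for a particular homogeneous right-hand side. In the sketch below, the choice $f(x, y) = (y - x)/(y + x)$, the slope $m = 2$, and the sample abscissas are illustrative assumptions. It evaluates $f(a, ma)$ at several points of the line $y = mx$ and compares the results with $f(1, m)$.

```python
def f(x, y):
    """A homogeneous right-hand side: f(tx, ty) = f(x, y) for every t != 0."""
    return (y - x) / (y + x)

m = 2.0  # hypothetical slope of the line y = m x through the origin
slopes = [f(a, m * a) for a in (-3.0, -0.5, 0.25, 1.0, 7.0)]

# Every entry should agree with f(1, m) up to rounding, so y = m x is an isocline.
print(slopes, f(1.0, m))
```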

This property of the isoclines suggests a property of the integral curves known as
invariance under similarity transformations. We recall that a similarity transformation
carries a set $S$ into a new set $kS$ obtained by multiplying the coordinates of each point
of $S$ by a constant factor $k > 0$. Every line through the origin remains fixed under a
similarity transformation. Therefore, the isoclines of a homogeneous equation do not
change under a similarity transformation; hence the appearance of the direction field
does not change either. This suggests that similarity transformations carry integral curves
into integral curves. To prove this analytically, let us assume that $S$ is an integral curve
described by an explicit formula of the form

(8.55)    $y = F(x)$.

To say that $S$ is an integral curve of $y' = f(x, y)$ means that we have

(8.56)    $F'(x) = f(x, F(x))$

for all $x$ under consideration. Now choose any point $(x, y)$ on $kS$. Then the point $(x/k, y/k)$
lies on $S$ and hence its coordinates satisfy (8.55), so we have $y/k = F(x/k)$ or $y = kF(x/k)$.
In other words, the curve $kS$ is described by the equation $y = G(x)$, where $G(x) = kF(x/k)$.
Note that the derivative of $G$ is given by

$G'(x) = kF'\left(\dfrac{x}{k}\right) \cdot \dfrac{1}{k} = F'\left(\dfrac{x}{k}\right)$.

To prove that $kS$ is an integral curve of $y' = f(x, y)$ it will suffice to show that
$G'(x) = f(x, G(x))$ or, what is the same thing, that

(8.57)    $F'\left(\dfrac{x}{k}\right) = f\left(x, kF\left(\dfrac{x}{k}\right)\right)$.

But if we replace $x$ by $x/k$ in Equation (8.56) and then use the homogeneity property with
$t = k$, we obtain

$F'\left(\dfrac{x}{k}\right) = f\left(\dfrac{x}{k}, F\left(\dfrac{x}{k}\right)\right) = f\left(x, kF\left(\dfrac{x}{k}\right)\right)$,

and this proves (8.57). In other words, we have shown that $kS$ is an integral curve whenever
$S$ is. A simple example in which this geometric property is quite obvious is the homogeneous
equation $y' = -x/y$ whose integral curves form a one-parameter family of concentric
circles given by the equation $x^2 + y^2 = C$.

It can also be shown that if the integral curves of a first-order equation $y' = f(x, y)$ are
invariant under similarity transformations, then the differential equation is necessarily
homogeneous.


8.26 Exercises 

1. Show that the substitution $y = x/v$ transforms a homogeneous equation $y' = f(x, y)$ into a
first-order equation for $v$ which is separable. Sometimes this substitution leads to integrals
that are easier to evaluate than those obtained by the substitution $y = xv$ discussed in the text.

6. $xy' = y - \sqrt{x^2 + y^2}$.

7. $x^2y' + xy + 2y^2 = 0$.

8. $y^2 + (x^2 - xy + y^2)y' = 0$.

9. $y' = \dfrac{y(x^2 + xy + y^2)}{x(x^2 + 3xy + y^2)}$.

10. $y' = \dfrac{y}{x} + \sin \dfrac{y}{x}$.

11. $x(y + 4x)y' + y(x + 4y) = 0$.


8.27 Some geometrical and physical problems leading to first-order equations 

We discuss next some examples of geometrical and physical problems that lead to 
first-order differential equations that are either separable or homogeneous. 

Orthogonal trajectories. Two curves are said to intersect orthogonally at a point if their 
tangent lines are perpendicular at that point. A curve which intersects every member of a 
family of curves orthogonally is called an orthogonal trajectory for the family. Figure 8.13 
shows some examples. Problems involving orthogonal trajectories are of importance in 
both pure and applied mathematics. For example, in the theory of fluid flow, two orthogonal 
families of curves are called the equipotential lines and the stream lines, respectively. In the 
theory of heat, they are known as isothermal lines and lines of flow. 

Suppose a given family of curves satisfies a first-order differential equation, say

(8.58)    $y' = f(x, y)$.

The number $f(x, y)$ is the slope of an integral curve passing through $(x, y)$. The slope of
each orthogonal trajectory through this point is the negative reciprocal $-1/f(x, y)$, so the
orthogonal trajectories satisfy the differential equation

(8.59)    $y' = -\dfrac{1}{f(x, y)}$.

If (8.58) is separable, then (8.59) is also separable. If (8.58) is homogeneous, then (8.59) is
also homogeneous.


EXAMPLE 1. Find the orthogonal trajectories of the family of all circles through the origin
with their centers on the x-axis.

Solution. In Example 2 of Section 8.21 we found that this family is given by the
Cartesian equation $x^2 + y^2 - 2Cx = 0$ and that it satisfies the differential equation
$y' = (y^2 - x^2)/(2xy)$. Replacing the right member by its negative reciprocal, we find that
the orthogonal trajectories satisfy the differential equation

$y' = \dfrac{2xy}{x^2 - y^2}$.

This homogeneous equation may be integrated by the substitution $y = vx$, and it leads to
the family of integral curves

$x^2 + y^2 - 2Cy = 0$.

This is a family of circles passing through the origin and having their centers on the y-axis.
Examples are shown in Figure 8.13.
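The orthogonality of the two families can be spot-checked at individual points. In the sketch below, the function names and sample points are illustrative assumptions. It picks, for a point $(x, y)$ off the axes with $x^2 \neq y^2$, the circle of each family through that point, obtains the two slopes by implicit differentiation, and prints their product, which should be close to $-1$.

```python
def slope_first_family(x, y):
    """Slope at (x, y) of the circle x^2 + y^2 - 2Cx = 0 through that point."""
    C = (x * x + y * y) / (2 * x)
    return (C - x) / y              # from 2x + 2y y' - 2C = 0

def slope_second_family(x, y):
    """Slope at (x, y) of the circle x^2 + y^2 - 2cy = 0 through that point."""
    c = (x * x + y * y) / (2 * y)
    return x / (c - y)              # from 2x + 2y y' - 2c y' = 0

for x, y in [(1.0, 2.0), (-0.5, 3.0), (2.0, -0.7)]:
    print(slope_first_family(x, y) * slope_second_family(x, y))
```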

Pursuit problems. A point $Q$ is constrained to move along a prescribed plane curve $C_1$.
Another point $P$ in the same plane “pursues” the point $Q$. That is, $P$ moves in such a
manner that its direction of motion is always toward $Q$. The point $P$ thereby traces out
another curve $C_2$ called a curve of pursuit. An example is shown in Figure 8.14 where $C_1$ is
the y-axis. In a typical problem of pursuit we seek to determine the curve $C_2$ when the
curve $C_1$ is known and some additional piece of information is given concerning $P$ and $Q$,
for example, a relation between their positions or their velocities.

Figure 8.13 Orthogonal circles.

Figure 8.14 The tractrix as a curve of pursuit. The distance from P to Q is constant.

When we say that the direction of motion of $P$ is always toward $Q$, we mean that the
tangent line of $C_2$ through $P$ passes through $Q$. Therefore, if we denote by $(x, y)$ the
rectangular coordinates of $P$ at a given instant, and by $(X, Y)$ those of $Q$ at the same
instant, we must have

(8.60)    $y' = \dfrac{Y - y}{X - x}$.

The additional piece of information usually enables us to consider $X$ and $Y$ as known
functions of $x$ and $y$, in which case Equation (8.60) becomes a first-order differential
equation for $y$. Now we consider a specific example in which this equation is separable.

EXAMPLE 2. A point $Q$ moves on a straight line $C_1$, and a point $P$ pursues $Q$ in such a
way that the distance from $P$ to $Q$ has a constant value $k > 0$. If $P$ is initially not on $C_1$,
find the curve of pursuit.

Solution. We take $C_1$ to be the y-axis and place $P$ initially at the point $(k, 0)$. Since
the distance from $P$ to $Q$ is $k$, we must have $(X - x)^2 + (Y - y)^2 = k^2$. But $X = 0$ on
$C_1$, so we have $Y - y = \sqrt{k^2 - x^2}$, and the differential equation (8.60) becomes

$y' = -\dfrac{\sqrt{k^2 - x^2}}{x}$.

Integrating this equation with the help of the substitution $x = k \cos t$ and using the fact
that $y = 0$ when $x = k$, we obtain the relation

$y = k \log \dfrac{k + \sqrt{k^2 - x^2}}{x} - \sqrt{k^2 - x^2}$.

The curve of pursuit in this example is called a tractrix; it is shown in Figure 8.14.
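The formula for the tractrix can be checked against its differential equation numerically. In the sketch below, the value $k = 2$, the sample abscissas, and the step $h$ are illustrative assumptions. It compares a central-difference estimate of $y'$ with $-\sqrt{k^2 - x^2}/x$ and evaluates $y$ at $x = k$ to confirm the initial condition.

```python
import math

def tractrix(x, k):
    """y = k log((k + sqrt(k^2 - x^2)) / x) - sqrt(k^2 - x^2), for 0 < x <= k."""
    s = math.sqrt(k * k - x * x)
    return k * math.log((k + s) / x) - s

def slope(x, k, h=1e-6):
    """Central-difference estimate of the derivative of the tractrix at x."""
    return (tractrix(x + h, k) - tractrix(x - h, k)) / (2 * h)

k = 2.0
for x in (0.5, 1.0, 1.5):
    expected = -math.sqrt(k * k - x * x) / x   # right member of the differential equation
    print(x, slope(x, k), expected)

print(tractrix(k, k))  # initial condition: P starts at (k, 0)
```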

Flow of fluid through an orifice. Suppose we are given a tank (not necessarily cylindrical)
containing a fluid. The fluid flows from the tank through a sharp-edged orifice. If there
were no friction (and hence no loss of energy) the speed of the jet would be equal to $\sqrt{2gy}$
feet per second, where $y$ denotes the height (in feet) of the surface above the orifice.† (See
Figure 8.15.) If $A_0$ denotes the area (in square feet) of the orifice, then $A_0\sqrt{2gy}$ represents
the number of cubic feet per second of fluid flowing from the orifice. Because of friction,
the jet stream contracts somewhat and the actual rate of discharge is more nearly $cA_0\sqrt{2gy}$,
where $c$ is an experimentally determined number called the discharge coefficient. For
ordinary sharp-edged orifices, the approximate value of $c$ is 0.60. Using this and taking
$g = 32$, we find that the speed of the jet is $4.8\sqrt{y}$ feet per second, and therefore the rate of
discharge of volume is $4.8A_0\sqrt{y}$ cubic feet per second.

Let $V(y)$ denote the volume of the fluid in the tank when the height of the fluid is $y$. If
the cross-sectional area of the tank at the height $u$ is $A(u)$, then we have $V(y) = \int_0^y A(u)\,du$,
from which we obtain $dV/dy = A(y)$. The argument in the foregoing paragraph tells us
that the rate of change of volume with respect to time is $dV/dt = -4.8A_0\sqrt{y}$ cubic feet per
second, the minus sign coming in because the volume is decreasing. By the chain rule we
have

$\dfrac{dV}{dt} = \dfrac{dV}{dy}\dfrac{dy}{dt} = A(y)\dfrac{dy}{dt}$.

Combining this with the equation $dV/dt = -4.8A_0\sqrt{y}$, we obtain the differential equation

$A(y)\dfrac{dy}{dt} = -4.8A_0\sqrt{y}$.

This separable differential equation is used as the mathematical model for problems
concerning fluid flow through an orifice. The height $y$ of the surface is related to the time
$t$ by an equation of the form

(8.61)    $\int \dfrac{A(y)}{\sqrt{y}}\,dy = -4.8A_0 \int dt + C$.

† If a particle of mass $m$ falls freely through a distance $y$ and reaches a speed $v$, its kinetic energy $\tfrac{1}{2}mv^2$
must be equal to the potential energy $mgy$ (the work done in lifting it up a distance $y$). Solving for $v$, we
get $v = \sqrt{2gy}$.

Figure 8.15 Flow of fluid through an orifice.


EXAMPLE 3. Consider a specific case in which the cross-sectional area of the tank is
constant, say $A(y) = A$ for all $y$, and suppose the level of the fluid is lowered from 10 feet
to 9 feet in 10 minutes (600 seconds). These data can be combined with Equation (8.61)
to give us

$-\int_{10}^{9} \dfrac{dy}{\sqrt{y}} = k \int_0^{600} dt$,

where $k = 4.8A_0/A$. Using this, we can determine $k$ and we find that

$2\sqrt{10} - 2\sqrt{9} = 600k$    or    $k = \dfrac{\sqrt{10} - 3}{300}$.

Now we can compute the time required for the level to fall from one given value to any
other. For example, if at time $t_1$ the level is 7 feet and at time $t_2$ it is 1 foot ($t_1$, $t_2$ measured
in minutes, say), then we must have

$\int_1^7 \dfrac{dy}{\sqrt{y}} = 60k \int_{t_1}^{t_2} dt$,

which yields

$t_2 - t_1 = \dfrac{2(\sqrt{7} - 1)}{60k} = \dfrac{10(\sqrt{7} - 1)}{\sqrt{10} - 3} = \dfrac{10(\sqrt{7} - 1)(\sqrt{10} + 3)}{10 - 9} = (10)(1.645)(6.162) \approx 101.4$ min.
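The arithmetic of this example can be reproduced in a few lines. The sketch below is an illustration: the function name drain_time is an assumption, and $k$ is measured with time in seconds, as in the example. It calibrates $k$ from the 10-foot to 9-foot data and then computes the time for the level to fall from 7 feet to 1 foot, which comes out at a little over 101 minutes.

```python
import math

def drain_time(y_start, y_end, k):
    """Seconds for the level to fall from y_start to y_end feet when
    dy/dt = -k sqrt(y), that is, t = 2 (sqrt(y_start) - sqrt(y_end)) / k."""
    return 2.0 * (math.sqrt(y_start) - math.sqrt(y_end)) / k

# Calibrate k from the data: 10 ft -> 9 ft in 600 seconds.
k = 2.0 * (math.sqrt(10.0) - math.sqrt(9.0)) / 600.0

minutes = drain_time(7.0, 1.0, k) / 60.0
print(f"k = {k:.6f}, time from 7 ft to 1 ft = {minutes:.1f} minutes")
```

Because the tank has constant cross section, only the ratio $k = 4.8A_0/A$ enters; neither $A_0$ nor $A$ needs to be known separately.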


8.28 Miscellaneous review exercises 


In each of Exercises 1 through 10 find the orthogonal trajectories of the given family of curves.

1. $2x + 3y = C$.
2. $xy = C$.
3. $x^2 + y^2 + 2Cy = 1$.
4. $y^2 = Cx$.
5. $x^2y = C$.
6. $y = Ce^{-2x}$.
7. $x^2 - y^2 = C$.
8. $y = C \cos x$.
9. All circles through the points $(1, 0)$ and $(-1, 0)$.
10. All circles through the points $(1, 1)$ and $(-1, -1)$.

11. A point $Q$ moves upward along the positive y-axis. A point $P$, initially at $(1, 0)$, pursues $Q$
in such a way that its distance from the y-axis is $\tfrac{1}{2}$ the distance of $Q$ from the origin. Find a
Cartesian equation for the path of pursuit.

12. Solve Exercise 11 when the fraction $\tfrac{1}{2}$ is replaced by an arbitrary positive number $k$.

13. A curve with Cartesian equation $y = f(x)$ passes through the origin. Lines drawn parallel
to the coordinate axes through an arbitrary point of the curve form a rectangle with two sides
on the axes. The curve divides every such rectangle into two regions $A$ and $B$, one of which
has an area equal to $n$ times the other. Find the function $f$.

14. Solve Exercise 13 if the two regions $A$ and $B$ have the property that, when rotated about the
x-axis, they sweep out solids one of which has a volume $n$ times that of the other.

15. The graph of a nonnegative differentiable function $f$ passes through the origin and through
the point $(1, 2/\pi)$. If, for every $x > 0$, the ordinate set of $f$ above the interval $[0, x]$ sweeps
out a solid of volume $x^2 f(x)$ when rotated about the x-axis, find the function $f$.

16. A nonnegative differentiable function $f$ is defined on the closed interval $[0, 1]$ with $f(1) = 0$.
For each $a$, $0 < a < 1$, the line $x = a$ cuts the ordinate set of $f$ into two regions having areas
$A$ and $B$, respectively, $A$ being the area of the leftmost region. If $A - B = 2f(a) + 3a + b$,
where $b$ is a constant independent of $a$, find the function $f$ and the constant $b$.

17. The graph of a function $f$ passes through the two points $P_0 = (0, 1)$ and $P_1 = (1, 0)$. For every
point $P = (x, y)$ on the graph, the curve lies above the chord $P_0P$, and the area $A(x)$ of the
region between the curve and the chord $P_0P$ is equal to $x^3$. Determine the function $f$.

18. A tank with vertical sides has a square cross-section of area 4 square feet. Water is leaving the
tank through an orifice of area 5/3 square inches. If the water level is initially 2 feet above
the orifice, find the time required for the level to drop 1 foot.

19. Refer to the preceding problem. If water also flows into the tank at the rate of 100 cubic inches
per second, show that the water level approaches the value $(25/24)^2$ feet above the orifice,
regardless of the initial water level.

20. A tank has the shape of a right circular cone with its vertex up. Find the time required to
empty a liquid from the tank through an orifice in its base. Express your result in terms of the
dimensions of the cone and the area $A_0$ of the orifice.






21. The equation $xy'' - y' + (1 - x)y = 0$ possesses a solution of the form $y = e^{mx}$, where $m$
is constant. Determine this solution explicitly.

22. Solve the differential equation $(x + y^3) + 6xy^2y' = 0$ by making a suitable change of variable
which converts it into a linear equation.

23. Solve the differential equation $(1 + y^2e^{2x})y' + y = 0$ by introducing a change of variable of
the form $y = ue^{mx}$, where $m$ is constant and $u$ is a new unknown function.

24. (a) Given a function $f$ which satisfies the relations

$2f'(x) = f\left(\dfrac{1}{x}\right)$ if $x > 0$,    $f(1) = 2$,

let $y = f(x)$ and show that $y$ satisfies a differential equation of the form

$x^2y'' + axy' + by = 0$,

where $a$ and $b$ are constants. Determine $a$ and $b$.

(b) Find a solution of the form $f(x) = Cx^n$.

25. (a) Let $u$ be a nonzero solution of the second-order equation

$y'' + P(x)y' + Q(x)y = 0$.

Show that the substitution $y = uv$ converts the equation

$y'' + P(x)y' + Q(x)y = R(x)$

into a first-order linear equation for $v'$.

(b) Obtain a nonzero solution of the equation $y'' - 4y' + x^2(y' - 4y) = 0$ by inspection
and use the method of part (a) to find a solution of

$y'' - 4y' + x^2(y' - 4y) = 2xe^{-x^3/3}$

such that $y = 0$ and $y' = 4$ when $x = 0$.

26. Scientists at the Ajax Atomics Works isolated one gram of a new radioactive element called
Deteriorum. It was found to decay at a rate proportional to the square of the amount present.
After one year, $\tfrac{1}{2}$ gram remained.

(a) Set up and solve the differential equation for the mass of Deteriorum remaining at time $t$.

(b) Evaluate the decay constant in units of gm$^{-1}$ yr$^{-1}$.

27. In the preceding problem, suppose the word square were replaced by square root, the other
data remaining the same. Show that in this case the substance would decay entirely within
a finite time, and find this time.

28. At the beginning of the Gold Rush, the population of Coyote Gulch, Arizona was 365. From 
then on, the population would have grown by a factor of e each year, except for the high rate 
of “accidental” death, amounting to one victim per day among every 100 citizens. By solving 
an appropriate differential equation determine, as functions of time, (a) the actual population of 
Coyote Gulch t years from the day the Gold Rush began, and (b) the cumulative number of 
fatalities. 

29. With what speed should a rocket be fired upward so that it never returns to earth? (Neglect
all forces except the earth's gravitational attraction.)

30. Let $y = f(x)$ be that solution of the differential equation

$y' = \dfrac{2y^2 + x}{3y^2 + 5}$

which satisfies the initial condition $f(0) = 0$. (Do not attempt to solve this differential equation.)

(a) The differential equation shows that $f'(0) = 0$. Discuss whether $f$ has a relative maximum
or minimum or neither at 0.

(b) Notice that $f'(x) \geq 0$ for each $x \geq 0$ and that $f'(x) \geq \tfrac{2}{3}$ for each $x \geq \tfrac{10}{3}$. Exhibit
two positive numbers $a$ and $b$ such that $f(x) > ax - b$ for each $x \geq \tfrac{10}{3}$.

(c) Show that $x/y^2 \to 0$ as $x \to +\infty$. Give full details of your reasoning.

(d) Show that $y/x$ tends to a finite limit as $x \to +\infty$ and determine this limit.

31. Given a function $f$ which satisfies the differential equation

$xf''(x) + 3x[f'(x)]^2 = 1 - e^{-x}$

for all real $x$. (Do not attempt to solve this differential equation.)

(a) If $f$ has an extremum at a point $c \neq 0$, show that this extremum is a minimum.

(b) If $f$ has an extremum at 0, is it a maximum or a minimum? Justify your conclusion.

(c) If $f(0) = f'(0) = 0$, find the smallest constant $A$ such that $f(x) \leq Ax^2$ for all $x \geq 0$.


COMPLEX NUMBERS


9.1 Historical introduction 

The quadratic equation $x^2 + 1 = 0$ has no solution in the real-number system because
there is no real number whose square is $-1$. New types of numbers, called complex numbers,
have been introduced to provide solutions to such equations. In this brief chapter we
discuss complex numbers and show that they are important in solving algebraic equations
and that they have an impact on differential and integral calculus.

As early as the 16th Century, a symbol $\sqrt{-1}$ was introduced to provide solutions of the
quadratic equation $x^2 + 1 = 0$. This symbol, later denoted by the letter $i$, was regarded
as a fictitious or imaginary number which could be manipulated algebraically like an
ordinary real number, except that its square was $-1$. Thus, for example, the quadratic
polynomial $x^2 + 1$ was factored by writing $x^2 + 1 = x^2 - i^2 = (x - i)(x + i)$, and the
solutions of the equation $x^2 + 1 = 0$ were exhibited as $x = \pm i$, without any concern
regarding the meaning or validity of such formulas. Expressions such as $2 + 3i$ were
called complex numbers, and they were used in a purely formal way for nearly 300 years
before they were described in a manner that would be considered satisfactory by present-day
standards.

Early in the 19th Century, Karl Friedrich Gauss (1777-1855) and William Rowan
Hamilton (1805-1865) independently and almost simultaneously proposed the idea of
defining complex numbers as ordered pairs $(a, b)$ of real numbers endowed with certain
special properties. This idea is widely accepted today and is described in the next section.


9.2 Definitions and field properties 


DEFINITION. If $a$ and $b$ are real numbers, the pair $(a, b)$ is called a complex number,
provided that equality, addition, and multiplication of pairs is defined as follows:

(a) Equality: $(a, b) = (c, d)$ means $a = c$ and $b = d$.

(b) Sum: $(a, b) + (c, d) = (a + c, b + d)$.

(c) Product: $(a, b)(c, d) = (ac - bd, ad + bc)$.


The definition of equality tells us that the pair $(a, b)$ is to be regarded as an ordered pair.
Thus, the complex number $(2, 3)$ is not equal to the complex number $(3, 2)$. The numbers
$a$ and $b$ are called components of $(a, b)$. The first component, $a$, is also called the real part
of the complex number; the second component, $b$, is called the imaginary part.

Note that the symbol $i = \sqrt{-1}$ does not appear anywhere in this definition. Presently
we shall introduce $i$ as a particular complex number which has all the algebraic properties
ascribed to the fictitious symbol $\sqrt{-1}$ by the early mathematicians. However, before we
do this, we will discuss the basic properties of the operations just defined.

THEOREM 9.1. The operations of addition and multiplication of complex numbers satisfy
the commutative, associative and distributive laws. That is, if $x$, $y$, and $z$ are arbitrary complex
numbers, we have the following.

Commutative laws: $x + y = y + x$, $xy = yx$.

Associative laws: $x + (y + z) = (x + y) + z$, $x(yz) = (xy)z$.

Distributive law: $x(y + z) = xy + xz$.

Proof. All these laws are easily verified directly from the definition of sum and product.
For example, to prove the associative law for multiplication, we write $x = (x_1, x_2)$,
$y = (y_1, y_2)$, $z = (z_1, z_2)$ and note that

$x(yz) = (x_1, x_2)(y_1z_1 - y_2z_2,\ y_1z_2 + y_2z_1)$

$= (x_1(y_1z_1 - y_2z_2) - x_2(y_1z_2 + y_2z_1),\ x_1(y_1z_2 + y_2z_1) + x_2(y_1z_1 - y_2z_2))$

$= ((x_1y_1 - x_2y_2)z_1 - (x_1y_2 + x_2y_1)z_2,\ (x_1y_2 + x_2y_1)z_1 + (x_1y_1 - x_2y_2)z_2)$

$= (x_1y_1 - x_2y_2,\ x_1y_2 + x_2y_1)(z_1, z_2) = (xy)z$.

The commutative and distributive laws may be similarly proved.
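The laws of Theorem 9.1 can be tried out on particular pairs. The sketch below is an illustration: the function names add and mul and the sample pairs are assumptions. It implements the sum and product exactly as defined above and tests the three laws; integer components keep the arithmetic exact, so each comparison prints True.

```python
def add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)

# Integer components keep the arithmetic exact, so == is a fair comparison.
x, y, z = (2, 3), (-1, 4), (5, -2)
print(mul(x, mul(y, z)) == mul(mul(x, y), z))            # associative law
print(mul(x, y) == mul(y, x))                            # commutative law
print(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)))    # distributive law
```

Such a check illustrates the laws for one choice of $x$, $y$, $z$ only; the proof above is what establishes them in general.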

Theorem 9.1 shows that the set of all complex numbers satisfies the first three field
axioms for the real number system, as given in Section I 3.2. Now we will show that
Axioms 4, 5, and 6 are also satisfied.

Since $(0, 0) + (a, b) = (a, b)$ for all complex numbers $(a, b)$, the complex number $(0, 0)$
is an identity element for addition. It is called the zero complex number. Similarly, the
complex number $(1, 0)$ is an identity for multiplication because

$(a, b)(1, 0) = (a, b)$

for all $(a, b)$. Thus, Axiom 4 is satisfied with $(0, 0)$ as the identity for addition and $(1, 0)$
as the identity for multiplication.

To verify Axiom 5, we simply note that $(-a, -b) + (a, b) = (0, 0)$, so $(-a, -b)$ is the
negative of $(a, b)$. We write $-(a, b)$ for $(-a, -b)$.

Finally, we show that each nonzero complex number has a reciprocal relative to the 
identity element (1, 0). That is, if (a, b) (0, 0), there is a complex number (c, d) such that 


(a, b)(c, 0= (1,0) . 


In fact, this equation is equivalent to the pair of equations 


ac — bd= 1, 


ad + be = 0 , 
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which has the unique solution 


(9.1) 


n 2 + b 2 ’ 


d — 


-b 

a 2 + b 2 ' 


The condition (a, b) ^ (0, 0) ensures that a 2 + b 2 ^ 0, so the reciprocal is well defined. 
We write (a, by 1 or 1 l(a,b) for the reciprocal of (a, b). Thus, we have 


(9.2) 


1 

(a, b) 


a - b \ 

. a 2 + b 2 ’ a 2 + bV 


if (a, b) ^ (0, 0) . 


The foregoing discussion shows that the set of all complex numbers satisfies the six
field axioms for the real-number system. Therefore, all the laws of algebra deducible from
the field axioms also hold for complex numbers. In particular, Theorems I.1 through I.15
of Section I 3.2 are all valid for complex numbers as well as for real numbers. Theorem
I.8 tells us that quotients of complex numbers exist. That is, if (a, b) and (c, d) are two
complex numbers with $(a, b) \ne (0, 0)$, then there is exactly one complex number (x, y)
such that (a, b)(x, y) = (c, d). In fact, we have $(x, y) = (c, d)(a, b)^{-1}$.
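
The field properties verified above can also be spot-checked by machine. The following is a minimal Python sketch and not part of the formal development: the function names `add`, `mul`, `reciprocal`, and `quotient` are our own hypothetical choices, and exact rational arithmetic (`fractions.Fraction`) is assumed so that equalities can be tested without rounding error.

```python
from fractions import Fraction


def add(p, q):
    """Sum of ordered pairs: (a, b) + (c, d) = (a + c, b + d)."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """Product of ordered pairs: (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)


def reciprocal(p):
    """Reciprocal given by (9.2); defined only when (a, b) != (0, 0)."""
    a, b = p
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no reciprocal")
    return (a / n, -b / n)


def quotient(q, p):
    """The unique (x, y) with p(x, y) = q, namely q times the reciprocal of p."""
    return mul(q, reciprocal(p))


# Sample pairs chosen only for illustration.
p = (Fraction(3), Fraction(4))
q = (Fraction(1), Fraction(-2))
r = (Fraction(5), Fraction(7))

print(mul(p, reciprocal(p)) == (1, 0))          # reciprocal relative to (1, 0)
print(mul(p, mul(q, r)) == mul(mul(p, q), r))   # associativity on one sample
print(mul(p, quotient(q, p)) == q)              # the quotient solves p(x, y) = q
```

A check on sample values is of course no substitute for the proofs; it only guards against slips in applying the formulas.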


9.3 The complex numbers as an extension of the real numbers 

Let C denote the set of all complex numbers. Consider the subset $C_0$ of C consisting of
all complex numbers of the form (a, 0), that is, all complex numbers with zero imaginary
part. The sum or product of two members of $C_0$ is again in $C_0$. In fact, we have

$$(a, 0) + (b, 0) = (a + b, 0) \quad \text{and} \quad (a, 0)(b, 0) = (ab, 0) . \tag{9.3}$$

This shows that we can add or multiply two numbers in $C_0$ by adding or multiplying the
real parts alone. Or, in other words, with respect to addition and multiplication, the
numbers in $C_0$ act exactly as though they were real numbers. The same is true for
subtraction and division, since $-(a, 0) = (-a, 0)$ and $(b, 0)^{-1} = (b^{-1}, 0)$ if $b \ne 0$. For this
reason, we ordinarily make no distinction between the real number x and the complex
number (x, 0) whose real part is x; we agree to identify x and (x, 0), and we write x = (x, 0).
In particular, we write 0 = (0, 0), 1 = (1, 0), -1 = (-1, 0), and so on. Thus, we can
think of the complex number system as an extension of the real number system.

The relation between $C_0$ and the real-number system can be described in a slightly
different way. Let R denote the set of all real numbers, and let f denote the function which
maps each real number x onto the complex number (x, 0). That is, if $x \in R$, let

$$f(x) = (x, 0) .$$

The function f so defined has domain R and range $C_0$, and it maps distinct elements of R
onto distinct elements of $C_0$. Because of these properties, f is said to establish a one-to-one
correspondence between R and $C_0$. The operations of addition and multiplication are
preserved under this correspondence. That is, we have

$$f(a + b) = f(a) + f(b) \quad \text{and} \quad f(ab) = f(a)f(b) ,$$

these equations being merely a restatement of (9.3). Since R satisfies the six field axioms,
the same is true of $C_0$. The two fields R and $C_0$ are said to be isomorphic; the function f
which relates them as described above is called an isomorphism. As far as the algebraic
operations of addition and multiplication are concerned, we make no distinction between
isomorphic fields. That is why we identify the real number x with the complex number
(x, 0). The complex-number system C is called an extension of the real-number system R
because it contains a subset $C_0$ which is isomorphic to R.

The field $C_0$ can also be ordered in such a way that the three order axioms of Section I 3.4
are satisfied. In fact, we simply define (x, 0) to be positive if and only if x > 0. It is trivial
to verify that Axioms 7, 8, and 9 are satisfied, so $C_0$ is an ordered field. The isomorphism
f described above also preserves order since it maps the positive elements of R onto the
positive elements of $C_0$.

9.4 The imaginary unit i 

Complex numbers have some algebraic properties not possessed by real numbers. For 
example, the quadratic equation x 2 + 1 = 0, which has no solution among the real 
numbers, can now be solved with the use of complex numbers. In fact, the complex 
number (0, 1) is a solution, since we have 

$$(0, 1)^2 = (0, 1)(0, 1) = (0 \cdot 0 - 1 \cdot 1,\; 0 \cdot 1 + 1 \cdot 0) = (-1, 0) = -1 .$$

The complex number (0, 1) is denoted by i and is called the imaginary unit. It has the
property that its square is -1, $i^2 = -1$. The reader can easily verify that $(-i)^2 = -1$,
so x = -i is another solution of the equation $x^2 + 1 = 0$.

Now we can relate the ordered-pair idea with the notation used by the early mathematicians.
First we note that the definition of multiplication of complex numbers gives
us (b, 0)(0, 1) = (0, b), and hence we have

(a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0)(0, 1) .

Therefore, if we write a = (a, 0), b = (b, 0), and i = (0, 1), we get (a, b) = a + bi. In
other words, we have proved the following.

THEOREM 9.2. Every complex number (a, b) can be expressed in the form (a, b) = a + bi. 

The advantage of this notation is that it aids us in algebraic manipulations of formulas
involving addition and multiplication. For example, if we multiply a + bi by c + di,
using the distributive and associative laws, and replace $i^2$ by -1, we find that

$$(a + bi)(c + di) = ac - bd + (ad + bc)i ,$$

which, of course, is in agreement with the definition of multiplication. Similarly, to
compute the reciprocal of a nonzero complex number a + bi, we may write

$$\frac{1}{a + bi} = \frac{a - bi}{(a + bi)(a - bi)} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{bi}{a^2 + b^2} .$$

This formula is in agreement with that given in (9.2).

By the introduction of complex numbers, we have gained much more than the ability
to solve the simple quadratic equation $x^2 + 1 = 0$. Consider, for example, the quadratic
equation $ax^2 + bx + c = 0$, where a, b, c are real and $a \ne 0$. By completing the square,
we may write this equation in the form

$$\left( x + \frac{b}{2a} \right)^2 + \frac{4ac - b^2}{4a^2} = 0 .$$


If $4ac - b^2 \le 0$, the equation has the real roots $(-b \pm \sqrt{b^2 - 4ac})/(2a)$. If $4ac - b^2 > 0$,
the left member is positive for every real x and the equation has no real roots. In this case,
however, there are two complex roots, given by the formulas

$$r_1 = -\frac{b}{2a} + i\,\frac{\sqrt{4ac - b^2}}{2a} \qquad \text{and} \qquad r_2 = -\frac{b}{2a} - i\,\frac{\sqrt{4ac - b^2}}{2a} . \tag{9.4}$$

In 1799, Gauss proved that every polynomial equation of the form

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0 ,$$

where $a_0, a_1, \ldots, a_n$ are arbitrary real numbers, with $a_n \ne 0$, has a solution among the
complex numbers if $n \ge 1$. Moreover, even if the coefficients $a_0, a_1, \ldots, a_n$ are complex,
a solution exists in the complex-number system. This fact is known as the fundamental
theorem of algebra.† It shows that there is no need to construct numbers more general
than complex numbers to solve polynomial equations with complex coefficients.


9.5 Geometric interpretation. Modulus and argument 

Since a complex number (x, y) is an ordered pair of real numbers, it may be represented
geometrically by a point in the plane, or by an arrow or geometric vector from the origin
to the point (x, y), as shown in Figure 9.1. In this context, the xy-plane is often referred
to as the complex plane. The x-axis is called the real axis; the y-axis is the imaginary axis.
It is customary to use the words complex number and point interchangeably. Thus, we
refer to the point z rather than the point corresponding to the complex number z.

The operations of addition and subtraction of complex numbers have a simple geometric
interpretation. If two complex numbers $z_1$ and $z_2$ are represented by arrows from the
origin to $z_1$ and $z_2$, respectively, then the sum $z_1 + z_2$ is determined by the parallelogram
law. The arrow from the origin to $z_1 + z_2$ is a diagonal of the parallelogram determined
by 0, $z_1$, and $z_2$, as illustrated by the example in Figure 9.2. The other diagonal is related
to the difference of $z_1$ and $z_2$. The arrow from $z_1$ to $z_2$ is parallel to and equal in length to
the arrow from 0 to $z_2 - z_1$; the arrow in the opposite direction, from $z_2$ to $z_1$, is related
in the same way to $z_1 - z_2$.


† A proof of the fundamental theorem of algebra can be found in almost any book on the theory of functions
of a complex variable. For example, see K. Knopp, Theory of Functions, Dover Publications, New York,
1945, or E. Hille, Analytic Function Theory, Vol. I, Blaisdell Publishing Co., 1959. A more elementary
proof is given in O. Schreier and E. Sperner, Introduction to Modern Algebra and Matrix Theory, Chelsea
Publishing Company, New York, 1951.

If $(x, y) \ne (0, 0)$, we can express x and y in polar coordinates,

$$x = r \cos \theta, \qquad y = r \sin \theta ,$$

and we obtain

$$x + iy = r(\cos \theta + i \sin \theta) . \tag{9.5}$$

The positive number r, which represents the distance of (x, y) from the origin, is called
the modulus or absolute value of x + iy and is denoted by $|x + iy|$. Thus, we have

$$|x + iy| = \sqrt{x^2 + y^2} .$$



Figure 9.1 Geometric representation of the complex number x + iy.

Figure 9.2 Addition and subtraction of complex numbers represented geometrically by the parallelogram law.

The polar angle $\theta$ is called an argument of x + iy. We say an argument rather than the
argument because for a given point (x, y) the angle $\theta$ is determined only up to multiples
of $2\pi$. Sometimes it is desirable to assign a unique argument to a complex number. This
may be done by restricting $\theta$ to lie in a half-open interval of length $2\pi$. The intervals
$[0, 2\pi)$ and $(-\pi, \pi]$ are commonly used for this purpose. We shall use the interval $(-\pi, \pi]$
and refer to the corresponding $\theta$ as the principal argument of x + iy; we denote this $\theta$ by
arg (x + iy). Thus, if $x + iy \ne 0$ and $r = |x + iy|$, we define arg (x + iy) to be the
unique real $\theta$ satisfying the conditions

$$x = r \cos \theta, \qquad y = r \sin \theta, \qquad -\pi < \theta \le \pi .$$

For the zero complex number, we assign the modulus 0 and agree that any real $\theta$ may be
used as an argument.
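
The definitions of modulus and principal argument translate directly into a small computation. The Python sketch below is illustrative only; the names `modulus` and `principal_argument` are our own. It relies on the standard function `math.atan2`, which can return $-\pi$ when the imaginary part is a negative floating-point zero, so that value is mapped to $\pi$ to respect the half-open interval $(-\pi, \pi]$ adopted in the text.

```python
import math


def modulus(x, y):
    """|x + iy| = sqrt(x^2 + y^2)."""
    return math.hypot(x, y)


def principal_argument(x, y):
    """The theta in (-pi, pi] with x = r cos(theta) and y = r sin(theta)."""
    if x == 0 and y == 0:
        raise ValueError("every real theta is an argument of the zero complex number")
    theta = math.atan2(y, x)
    # atan2 yields -pi for inputs such as (y, x) = (-0.0, -1.0); replace it by pi
    # so that the result always lies in the half-open interval (-pi, pi].
    return math.pi if theta == -math.pi else theta


print(modulus(3, 4))               # distance of (3, 4) from the origin
print(principal_argument(0, 2))    # argument of 2i
print(principal_argument(-1, 0))   # argument of -1
```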

Since the absolute value of a complex number z is simply the length of a line segment, it
is not surprising to find that it has the usual properties of absolute values of real numbers.
For example, we have

$$|z| > 0 \ \text{ if } z \ne 0, \quad \text{and} \quad |z_1 - z_2| = |z_2 - z_1| .$$

Geometrically, the absolute value $|z_1 - z_2|$ represents the distance between the points $z_1$
and $z_2$ in the complex plane. The triangle inequality

$$|z_1 + z_2| \le |z_1| + |z_2|$$

is also valid. In addition, we have the following formulas for absolute values of products
and quotients of complex numbers:

$$|z_1 z_2| = |z_1|\,|z_2| \tag{9.6}$$

and

$$\left| \frac{z_1}{z_2} \right| = \frac{|z_1|}{|z_2|} \quad \text{if } z_2 \ne 0 .$$

If we write $z_1 = a + bi$ and $z_2 = c + di$, we obtain (9.6) at once from the identity

$$(ac - bd)^2 + (bc + ad)^2 = (a^2 + b^2)(c^2 + d^2) .$$

The formula for $|z_1/z_2|$ follows from (9.6) if we write $z_1$ as a product,

$$z_1 = z_2\,\frac{z_1}{z_2} .$$

If $z = x + iy$, the complex conjugate of z is the complex number $\bar{z} = x - iy$. Geometrically,
$\bar{z}$ represents the reflection of z through the real axis. The definition of conjugate
implies that

$$\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \qquad \overline{z_1 z_2} = \bar{z}_1 \bar{z}_2, \qquad \overline{z_1/z_2} = \bar{z}_1/\bar{z}_2, \qquad z\bar{z} = |z|^2 .$$


The verification of these properties is left as an exercise for the reader. 
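
Before attempting the verification, the reader may find it helpful to test the properties of absolute values and conjugates on sample values. The following Python sketch uses the language's built-in `complex` type; the helper `close`, the tolerance, and the sample numbers are our own assumptions, and a numerical check of this kind is not a proof.

```python
z1 = complex(3, -4)
z2 = complex(1, 2)


def close(u, v, tol=1e-12):
    """True when u and v agree up to a small floating-point tolerance."""
    return abs(u - v) <= tol


checks = {
    "|z1 z2| = |z1| |z2|": close(abs(z1 * z2), abs(z1) * abs(z2)),
    "|z1 / z2| = |z1| / |z2|": close(abs(z1 / z2), abs(z1) / abs(z2)),
    "triangle inequality": abs(z1 + z2) <= abs(z1) + abs(z2),
    "conjugate of a sum": close((z1 + z2).conjugate(), z1.conjugate() + z2.conjugate()),
    "conjugate of a product": close((z1 * z2).conjugate(), z1.conjugate() * z2.conjugate()),
    "conjugate of a quotient": close((z1 / z2).conjugate(), z1.conjugate() / z2.conjugate()),
    "z times its conjugate": close(z1 * z1.conjugate(), abs(z1) ** 2),
}

for name, ok in checks.items():
    print(name, ok)
```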

If a quadratic equation with real coefficients has no real roots, its complex roots, given
by (9.4), are conjugates. Conversely, if $r_1$ and $r_2$ are complex conjugates, say $r_1 = \alpha + i\beta$
and $r_2 = \alpha - i\beta$, where $\alpha$ and $\beta$ are real, then $r_1$ and $r_2$ are roots of a quadratic equation
with real coefficients. In fact, we have

$$r_1 + r_2 = 2\alpha \quad \text{and} \quad r_1 r_2 = \alpha^2 + \beta^2 ,$$

so

$$(x - r_1)(x - r_2) = x^2 - (r_1 + r_2)x + r_1 r_2 ,$$

and the quadratic equation in question is

$$x^2 - 2\alpha x + \alpha^2 + \beta^2 = 0 .$$

9.6 Exercises

1. Express the following complex numbers in the form a + bi.

(a) $(1 + i)^2$.    (e) $(1 + i)/(1 - 2i)$.
(b) $1/i$.    (f) $i^5 + i^{16}$.
(c) $1/(1 + i)$.    (g) $1 + i + i^2 + i^3$.
(d) $(2 + 3i)(3 - 4i)$.    (h) $\tfrac{1}{2}(1 + i)(1 + i^{-8})$.

2. Compute the absolute values of the following complex numbers.

(a) $1 + i$.    (d) $1 + i + i^2$.
(b) $3 + 4i$.    (e) $i^7 + i^{10}$.
(c) $(1 + i)/(1 - i)$.    (f) $2(1 - i) + 3(2 + i)$.

3. Compute the modulus and principal argument of each of the following complex numbers.

(a) $2i$.    (f) $(1 + i)/\sqrt{2}$.
(b) $-3i$.    (g) $(-1 + i)^3$.
(c) $-1$.    (h) $(-1 - i)^3$.
(d) $1$.    (i) $1/(1 + i)$.
(e) $-3 + \sqrt{3}\,i$.    (j) $1/(1 + i)^2$.

4. In each case, determine all real numbers x and y which satisfy the given relation.

(a) $x + iy = x - iy$.    (d) $(x + iy)^2 = (x - iy)^2$.
(b) $x + iy = |x + iy|$.    (e) $\dfrac{x + iy}{x - iy} = x - iy$.
(c) $|x + iy| = |x - iy|$.    (f) $\sum_{k=0}^{100} i^k = x + iy$.

5. Make a sketch showing the set of all z in the complex plane which satisfy each of the following
conditions.

(a) $|z| < 1$.    (d) $|z - 1| = |z + 1|$.
(b) $z + \bar{z} = 1$.    (e) $|z - i| = |z + i|$.
(c) $z - \bar{z} = i$.    (f) $z + \bar{z} = |z|^2$.
6. Let f be a polynomial with real coefficients.

(a) Show that $\overline{f(z)} = f(\bar{z})$ for every complex z.

(b) Use part (a) to deduce that the nonreal zeros of f (if any exist) must occur in pairs of
conjugate complex numbers.

7. Prove that an ordering relation cannot be introduced in the complex number system so that
all three order axioms of Section I 3.4 are satisfied.

[Hint: Assume that such an ordering can be introduced and try to decide whether the
imaginary unit i is positive or negative.]

8. Define the following “pseudo-ordering” among the complex numbers. If $z = x + iy$, we say
that z is positive if and only if x > 0. Which of the order axioms of Section I 3.4 are satisfied
with this definition of positive?

9. Solve Exercise 8 if the pseudo-ordering is defined as follows: We say that z is positive if and
only if $|z| > 0$.

10. Solve Exercise 8 if the pseudo-ordering is defined as follows: If $z = x + iy$, we say that z is
positive if and only if x > y.

11. Make a sketch showing the set of all complex z which satisfy each of the following conditions.

(a) $|2z + 3| < 1$.    (c) $|z - i| < |z + i|$.
(b) $|z + 1| < |z - 1|$.    (d) $|z| < |2z + 1|$.

12. Let $w = (az + b)/(cz + d)$, where a, b, c, and d are real. Prove that

$$w - \bar{w} = (ad - bc)(z - \bar{z})/|cz + d|^2 .$$

If $ad - bc > 0$, prove that the imaginary parts of z and w have the same sign.

9.7 Complex exponentials

We wish now to extend the definition of $e^x$ so that it becomes meaningful when x is
replaced by any complex number z. We wish this extension to be such that the law of
exponents, $e^a e^b = e^{a+b}$, will be valid for all complex a and b. And, of course, we want $e^z$
to agree with the usual exponential when z is real. There are several equivalent ways to
carry out this extension. Before we state the definition of $e^z$ that we have chosen, we shall
give a heuristic discussion which will serve as motivation for this definition.

If we write $z = x + iy$, then, if the law of exponents is to be valid for complex numbers,
we must have

$$e^z = e^{x+iy} = e^x e^{iy} .$$

Since $e^x$ has already been defined when x is real, our task is to arrive at a reasonable
definition for $e^{iy}$ when y is real. Now, if $e^{iy}$ is to be a complex number, we may write

$$e^{iy} = A(y) + iB(y) , \tag{9.7}$$

where A and B are real-valued functions to be determined. Let us differentiate both sides
of Equation (9.7), assuming A and B are differentiable, and treating the complex number
i as though it were a real number. Then we get

$$ie^{iy} = A'(y) + iB'(y) . \tag{9.8}$$

Differentiating once more, we find that

$$-e^{iy} = A''(y) + iB''(y) .$$

Comparison of this equation with (9.7) shows that A and B must satisfy the equations

$$A''(y) = -A(y) \quad \text{and} \quad B''(y) = -B(y) .$$

In other words, each of the functions A and B is a solution of the differential equation
$f'' + f = 0$. From the work of Chapter 8, we know that this equation has exactly one
solution with specified initial values $f(0)$ and $f'(0)$. If we put y = 0 in (9.7) and (9.8) and
use the fact that $e^0 = 1$, we find that A and B have the initial values

$$A(0) = 1, \quad A'(0) = 0, \quad \text{and} \quad B(0) = 0, \quad B'(0) = 1 .$$

By the uniqueness theorem for second-order differential equations with constant coefficients,
we must have

$$A(y) = \cos y \quad \text{and} \quad B(y) = \sin y .$$

In other words, if $e^{iy}$ is to be a complex number with the properties just described, then
we must have $e^{iy} = \cos y + i \sin y$. This discussion serves to motivate the following
definition.

DEFINITION. If $z = x + iy$, we define $e^z$ to be the complex number given by the equation

$$e^z = e^x(\cos y + i \sin y) . \tag{9.9}$$

Note that $e^z = e^x$ when y = 0; hence this exponential agrees with the usual exponential
when z is real. Now we shall use this definition to deduce the law of exponents.

THEOREM 9.3. If a and b are complex numbers, we have

$$e^a e^b = e^{a+b} . \tag{9.10}$$

Proof. Writing $a = x + iy$ and $b = u + iv$, we have

$$e^a = e^x(\cos y + i \sin y), \qquad e^b = e^u(\cos v + i \sin v) ,$$

so

$$e^a e^b = e^x e^u[\cos y \cos v - \sin y \sin v + i(\cos y \sin v + \sin y \cos v)] .$$

Now we use the addition formulas for cos (y + v) and sin (y + v) and the law of exponents
for real exponentials, and we see that the foregoing equation becomes

$$e^a e^b = e^{x+u}[\cos (y + v) + i \sin (y + v)] . \tag{9.11}$$

Since a + b = (x + u) + i(y + v), the right member of (9.11) is $e^{a+b}$. This proves (9.10).
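
The definition (9.9) and the law of exponents (9.10) are easy to try out numerically. In the Python sketch below, `cexp` is our own hypothetical name for a direct transcription of (9.9); the comparison with the library function `cmath.exp` and the sample exponents are included only as a sanity check under floating-point arithmetic.

```python
import cmath
import math


def cexp(z):
    """e^z = e^x (cos y + i sin y) for z = x + iy, transcribing (9.9)."""
    x, y = z.real, z.imag
    return math.exp(x) * complex(math.cos(y), math.sin(y))


a = complex(0.5, 2.0)
b = complex(-1.25, 0.75)

print(abs(cexp(a) - cmath.exp(a)))            # agreement with the library exponential
print(abs(cexp(a) * cexp(b) - cexp(a + b)))   # law of exponents, up to rounding
print(cexp(complex(1.5, 0.0)), math.exp(1.5)) # agreement with the real exponential
```

Both differences printed above should be at the level of rounding error.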

THEOREM 9.4. Every complex number $z \ne 0$ can be expressed in the form

$$z = re^{i\theta} , \tag{9.12}$$

where $r = |z|$ and $\theta = \arg (z) + 2n\pi$, n being any integer. This representation is called the
polar form of z.

Proof. If $z = x + iy$, the polar-coordinate representation (9.5) gives us

$$z = r(\cos \theta + i \sin \theta) ,$$

where $r = |z|$ and $\theta = \arg (z) + 2n\pi$, n being any integer. But if we take x = 0 and $y = \theta$
in (9.9), we obtain the formula

$$e^{i\theta} = \cos \theta + i \sin \theta ,$$

which proves (9.12).
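
The polar form can be computed as follows. This is a sketch under stated assumptions: the names `to_polar` and `from_polar` are ours, and `cmath.phase` is used for the argument (it returns values in $[-\pi, \pi]$, which coincides with the principal argument except possibly for negative floating-point zeros). The last line illustrates that adding a multiple of $2\pi$ to $\theta$ gives the same complex number.

```python
import cmath
import math


def to_polar(z):
    """Return (r, theta) with r = |z| and theta the argument given by cmath.phase."""
    if z == 0:
        raise ValueError("the polar form (9.12) is stated for z != 0")
    return abs(z), cmath.phase(z)


def from_polar(r, theta):
    """Return r e^{i theta} = r (cos theta + i sin theta)."""
    return complex(r * math.cos(theta), r * math.sin(theta))


z = complex(-1.0, math.sqrt(3.0))
r, theta = to_polar(z)

print(r, theta)                                       # modulus and argument of z
print(abs(from_polar(r, theta) - z))                  # round trip, up to rounding
print(abs(from_polar(r, theta + 2 * math.pi * 5) - z))  # theta + 2n*pi with n = 5
```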

The representation of complex numbers in the polar form (9.12) is especially useful in
connection with multiplication and division of complex numbers. For example, if $z_1 = r_1 e^{i\theta}$
and $z_2 = r_2 e^{i\phi}$, we have

$$z_1 z_2 = r_1 e^{i\theta} r_2 e^{i\phi} = r_1 r_2 e^{i(\theta + \phi)} . \tag{9.13}$$

Therefore the product of the moduli, $r_1 r_2$, is the modulus of the product $z_1 z_2$, in agreement
with Equation (9.6), and the sum of the arguments, $\theta + \phi$, is an admissible argument for
the product $z_1 z_2$.

When $z = re^{i\theta}$, repeated application of (9.13) gives us the formula

$$z^n = r^n e^{in\theta} = r^n(\cos n\theta + i \sin n\theta) ,$$

valid for any nonnegative integer n. This formula is also valid for negative integers n if
we define $z^{-m}$ to be $(z^{-1})^m$ when m is a positive integer.
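
To see the product and power formulas at work, one can carry out the arithmetic on the pairs (modulus, argument) and convert back only at the end. The sketch below is ours and purely illustrative; `polar_mul`, `polar_pow`, and `to_complex` are hypothetical helper names, and the sample number is 1 + i written in polar form.

```python
import math


def polar_mul(p, q):
    """(r1, theta) times (r2, phi) gives (r1 r2, theta + phi), as in (9.13)."""
    return (p[0] * q[0], p[1] + q[1])


def polar_pow(p, n):
    """(r, theta) to the integer power n gives (r^n, n theta); requires r > 0."""
    return (p[0] ** n, n * p[1])


def to_complex(p):
    """Convert (r, theta) back to r (cos theta + i sin theta)."""
    r, theta = p
    return complex(r * math.cos(theta), r * math.sin(theta))


z = (math.sqrt(2.0), math.pi / 4)   # 1 + i, with modulus sqrt(2) and argument pi/4

print(to_complex(polar_mul(z, (2.0, math.pi / 2))))  # compare with (1 + i)(2i)
print(to_complex(polar_pow(z, 8)))                   # compare with (1 + i)^8
print(to_complex(polar_pow(z, -2)))                  # compare with 1/(1 + i)^2
```

Note that the argument produced this way need not be the principal one; it is only an admissible argument.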

Similarly, we have

$$\frac{z_1}{z_2} = \frac{r_1 e^{i\theta}}{r_2 e^{i\phi}} = \frac{r_1}{r_2}\, e^{i(\theta - \phi)} ,$$

so the modulus of $z_1/z_2$ is $r_1/r_2$ and the difference $\theta - \phi$ is an admissible argument for $z_1/z_2$.


9.8 Complex-valued functions 

A function f whose values are complex numbers is called a complex-valued function.
If the domain of f is a set of real numbers, f is called a complex-valued function of a real
variable. If the domain is a set of complex numbers, f is called a complex-valued function
of a complex variable, or more simply, a function of a complex variable. An example is
the exponential function, defined by the equation

$$f(z) = e^z$$

for all complex z. Most of the familiar elementary functions of calculus, such as the
exponential, the logarithm, and the trigonometric functions, can be extended to become
functions of a complex variable. (See Exercises 9 and 10 in Section 9.10.) In this more
general framework many new properties and interrelationships are often revealed. For
example, the complex exponential function is periodic. In fact, if $z = x + iy$ and if n
is any integer, we have

$$e^{z + 2n\pi i} = e^x[\cos (y + 2n\pi) + i \sin (y + 2n\pi)] = e^x(\cos y + i \sin y) = e^z .$$

Thus we see that $f(z + 2n\pi i) = f(z)$, so f has the period $2\pi i$. This property of the
exponential function is revealed only when we study the exponential as a function of a complex
variable.
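
The periodicity is easy to observe numerically. The short Python sketch below is ours; it evaluates the library exponential `cmath.exp` at z and at several translates $z + 2n\pi i$ for an arbitrarily chosen sample z, and records the largest discrepancy, which should be no bigger than rounding error.

```python
import cmath
import math

z = complex(0.3, -1.1)                 # arbitrary sample point
period = complex(0.0, 2.0 * math.pi)   # the period 2*pi*i

# Values of the exponential at z + 2*n*pi*i for n = -3, ..., 3.
shifted = [cmath.exp(z + n * period) for n in range(-3, 4)]
max_gap = max(abs(w - cmath.exp(z)) for w in shifted)

print(max_gap)
```

One consequence worth noting is that the complex exponential, unlike the real one, is not one-to-one.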

The first systematic treatment of the differential and integral calculus of functions of 
a complex variable was given by Cauchy early in the 19th Century. Since then the theory 
has developed into one of the most important and interesting branches of mathematics. 
It has become an indispensable tool for physicists and engineers and has connections in 
nearly every branch of pure mathematics. A discussion of this theory will not be given 
here. We shall discuss only the rudiments of the calculus of complex-valued functions of a
real variable.

Suppose f is a complex-valued function defined on some interval I of real numbers. For
each x in I, the function value f(x) is a complex number, so we can write

$$f(x) = u(x) + iv(x) ,$$

where u(x) and v(x) are real. This equation determines two real-valued functions u and v
called, respectively, the real and imaginary parts of f; we write the equation more briefly
as f = u + iv. Concepts such as continuity, differentiation, and integration of f may be
defined in terms of the corresponding concepts for u and v, as described in the following
definition.

DEFINITION. If f = u + iv, we say f is continuous at a point if both u and v are
continuous at that point. The derivative of f is defined by the equation

$$f'(x) = u'(x) + iv'(x)$$

whenever both derivatives u'(x) and v'(x) exist. Similarly, we define the integral of f by the
equation

$$\int_a^b f(x)\,dx = \int_a^b u(x)\,dx + i \int_a^b v(x)\,dx$$

whenever both integrals on the right exist.


In view of this definition, it is not surprising to find that many of the theorems of
differential and integral calculus are also valid for complex-valued functions. For example, the
rules for differentiating sums, products, and quotients (Theorem 4.1) are valid for complex
functions. The first and second fundamental theorems of calculus (Theorems 5.1 and 5.3)
as well as the zero-derivative theorem (Theorem 5.2) also hold for complex functions. To
illustrate the ease with which these theorems can be proved, we consider the zero-derivative
theorem:

If $f'(x) = 0$ for all x on an open interval I, then f is constant on I.

Proof. Write f = u + iv. Since $f' = u' + iv'$, the statement $f' = 0$ on I means that
both $u'$ and $v'$ are zero on I. Hence, by Theorem 5.2, both u and v are constant on I.
Therefore f is constant on I.


9.9 Examples of differentiation and integration formulas 

In this section we discuss an important example of a complex-valued function of a real
variable, namely the function f defined for all real x by the equation

$$f(x) = e^{tx} ,$$

where t is a fixed complex number. When t is real, the derivative of this function is given
by the formula $f'(x) = te^{tx}$. Now we prove that this formula is also valid for complex t.

THEOREM 9.5. If $f(x) = e^{tx}$ for all real x and a fixed complex t, then $f'(x) = te^{tx}$.

Proof. Write $t = \alpha + i\beta$, where $\alpha$ and $\beta$ are real. From the definition of the complex
exponential, we have

$$f(x) = e^{tx} = e^{\alpha x + i\beta x} = e^{\alpha x} \cos \beta x + ie^{\alpha x} \sin \beta x .$$

Therefore, the real and imaginary parts of f are given by

$$u(x) = e^{\alpha x} \cos \beta x \quad \text{and} \quad v(x) = e^{\alpha x} \sin \beta x . \tag{9.14}$$

These functions are differentiable for all x and their derivatives are given by the formulas

$$u'(x) = \alpha e^{\alpha x} \cos \beta x - \beta e^{\alpha x} \sin \beta x , \qquad v'(x) = \alpha e^{\alpha x} \sin \beta x + \beta e^{\alpha x} \cos \beta x .$$

Since $f'(x) = u'(x) + iv'(x)$, we have

$$f'(x) = \alpha e^{\alpha x}(\cos \beta x + i \sin \beta x) + i\beta e^{\alpha x}(\cos \beta x + i \sin \beta x) = (\alpha + i\beta)e^{(\alpha + i\beta)x} = te^{tx} .$$

This completes the proof.
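
Theorem 9.5 can be compared with a numerical derivative. The sketch below is our own illustration: the complex-valued function $f(x) = e^{tx}$ is differentiated with a central difference quotient (a standard approximation that we swap in for the exact derivative, with a step size chosen by assumption), and the result is compared with $te^{tx}$ at a sample point.

```python
import cmath


def f(x, t):
    """The complex-valued function f(x) = e^{tx} of the real variable x."""
    return cmath.exp(t * x)


def central_difference(g, x, h=1e-6):
    """Approximate g'(x) by (g(x + h) - g(x - h)) / (2h)."""
    return (g(x + h) - g(x - h)) / (2 * h)


t = complex(-0.5, 3.0)   # sample value of the fixed complex number t
x0 = 0.7                 # sample point on the real line

approx = central_difference(lambda x: f(x, t), x0)
exact = t * f(x0, t)     # the formula of Theorem 9.5

print(approx)
print(exact)
print(abs(approx - exact))
```

Because the difference quotient acts separately on the real and imaginary parts, this check mirrors the definition $f' = u' + iv'$.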

Theorem 9.5 has some interesting consequences. For example, if we adopt the Leibniz
notation for indefinite integrals, we can restate Theorem 9.5 in the form

$$\int e^{tx}\,dx = \frac{e^{tx}}{t} \tag{9.15}$$

when $t \ne 0$. If we let $t = \alpha + i\beta$ and equate the real and imaginary parts of Equation
(9.15), we obtain the integration formulas

$$\int e^{\alpha x} \cos \beta x\,dx = \frac{e^{\alpha x}(\alpha \cos \beta x + \beta \sin \beta x)}{\alpha^2 + \beta^2}$$

and

$$\int e^{\alpha x} \sin \beta x\,dx = \frac{e^{\alpha x}(\alpha \sin \beta x - \beta \cos \beta x)}{\alpha^2 + \beta^2} ,$$

which are valid if $\alpha$ and $\beta$ are not both zero.
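
The two integration formulas can be tested against a numerical quadrature. In the Python sketch below, the composite Simpson rule is our own choice of approximation (any reasonable quadrature would do), and all function names, the sample values of $\alpha$ and $\beta$, and the interval of integration are assumptions made for illustration.

```python
import math


def cos_antiderivative(x, alpha, beta):
    """e^{alpha x} (alpha cos(beta x) + beta sin(beta x)) / (alpha^2 + beta^2)."""
    return (math.exp(alpha * x)
            * (alpha * math.cos(beta * x) + beta * math.sin(beta * x))
            / (alpha ** 2 + beta ** 2))


def sin_antiderivative(x, alpha, beta):
    """e^{alpha x} (alpha sin(beta x) - beta cos(beta x)) / (alpha^2 + beta^2)."""
    return (math.exp(alpha * x)
            * (alpha * math.sin(beta * x) - beta * math.cos(beta * x))
            / (alpha ** 2 + beta ** 2))


def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


alpha, beta = 0.5, 2.0
a, b = 0.0, 1.5

numeric_cos = simpson(lambda x: math.exp(alpha * x) * math.cos(beta * x), a, b)
exact_cos = cos_antiderivative(b, alpha, beta) - cos_antiderivative(a, alpha, beta)
numeric_sin = simpson(lambda x: math.exp(alpha * x) * math.sin(beta * x), a, b)
exact_sin = sin_antiderivative(b, alpha, beta) - sin_antiderivative(a, alpha, beta)

print(numeric_cos, exact_cos)
print(numeric_sin, exact_sin)
```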

Another consequence of Theorem 9.5 is the connection between complex exponentials 
and second-order linear differential equations with constant coefficients. 


THEOREM 9.6. Consider the differential equation

$$y'' + ay' + by = 0 , \tag{9.16}$$

where a and b are real constants. The real and imaginary parts of the function f defined on
$(-\infty, +\infty)$ by the equation $f(x) = e^{tx}$ are solutions of the differential equation (9.16) if
and only if t is a root of the characteristic equation

$$t^2 + at + b = 0 .$$

Proof. Let $L(y) = y'' + ay' + by$. Since $f'(x) = te^{tx}$, we also have $f''(x) = t^2 e^{tx}$, so
$L(f) = e^{tx}(t^2 + at + b)$. But $e^{tx}$ is never zero since $e^{tx}e^{-tx} = e^0 = 1$. Hence, $L(f) = 0$
if and only if $t^2 + at + b = 0$. But if we write f = u + iv, we find $L(f) = L(u) + iL(v)$,
and hence $L(f) = 0$ if and only if both $L(u) = 0$ and $L(v) = 0$. This completes the proof.

Note: If $t = \alpha + i\beta$, the real and imaginary parts of f are given by (9.14). If the
characteristic equation has two distinct roots, real or complex, the linear combination

$$y = c_1 u(x) + c_2 v(x)$$

is the general solution of the differential equation. This agrees with the results proved
in Theorem 8.7.
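
The role of the characteristic equation can be illustrated with a few lines of Python. This is a sketch under our own assumptions: the names `characteristic_roots` and `L_of_exponential` are hypothetical, the coefficients a = 2 and b = 5 are sample values, and the operator L is applied to $e^{tx}$ through the identity $L(f) = e^{tx}(t^2 + at + b)$ from the proof rather than by numerical differentiation.

```python
import cmath


def characteristic_roots(a, b):
    """Roots of t^2 + a t + b = 0 for real a and b."""
    s = cmath.sqrt(a * a - 4 * b)
    return (-a + s) / 2, (-a - s) / 2


def L_of_exponential(t, x, a, b):
    """L(f)(x) for f(x) = e^{tx}, using f' = t f and f'' = t^2 f."""
    return cmath.exp(t * x) * (t * t + a * t + b)


a, b = 2.0, 5.0
t1, t2 = characteristic_roots(a, b)

print(t1, t2)                               # a pair of conjugate roots
print(abs(L_of_exponential(t1, 0.3, a, b))) # vanishes when t is a root
print(abs(L_of_exponential(1j, 0.3, a, b))) # does not vanish when t = i is not a root
```

Since $L(f) = L(u) + iL(v)$, the real and imaginary parts of the first residual are $L(u)$ and $L(v)$ at the sample point; both vanish for the root $t_1$.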

Further examples of complex functions are discussed in the next set of exercises. 


9.10 Exercises 


1. Express each of the following complex numbers in the form a + bi.

(a) $e^{\pi i/2}$.    (e) $i + e^{2\pi i}$.
(b) $2e^{-\pi i/2}$.    (f) $e^{\pi i/4}$.
(c) $3e^{\pi i}$.    (g) $e^{\pi i/4} - e^{-\pi i/4}$.
(d) $-e^{-\pi i}$.    (h) $\dfrac{1 - e^{\pi i/2}}{1 + e^{\pi i/2}}$.

2. In each case, find all real x and y that satisfy the given relation.

(a) $x + iy = xe^{iy}$.    (c) $e^{x+iy} = -1$.
(b) $x + iy = ye^{ix}$.    (d) $\dfrac{1 + i}{1 - i} = xe^{iy}$.

3. (a) Prove that $e^z \ne 0$ for all complex z.
(b) Find all complex z for which $e^z = 1$.

4. (a) If $\theta$ is real, show that

$$\cos \theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad \text{and} \quad \sin \theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} .$$

(b) Use the formulas in (a) to deduce the identities

$$\cos^2 \theta = \tfrac{1}{2}(1 + \cos 2\theta), \qquad \sin^2 \theta = \tfrac{1}{2}(1 - \cos 2\theta) .$$

5. (a) Prove DeMoivre's theorem,

$$(\cos \theta + i \sin \theta)^n = \cos n\theta + i \sin n\theta ,$$

valid for every real $\theta$ and every positive integer n.

(b) Take n = 3 in part (a) and deduce the trigonometric identities

$$\sin 3\theta = 3 \cos^2 \theta \sin \theta - \sin^3 \theta, \qquad \cos 3\theta = \cos^3 \theta - 3 \cos \theta \sin^2 \theta .$$

6. Prove that every trigonometric sum of the form

$$S_n(x) = \tfrac{1}{2}a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

can be expressed as a sum of complex exponentials,

$$S_n(x) = \sum_{k=-n}^{n} c_k e^{ikx} ,$$

where $c_k = \tfrac{1}{2}(a_k - ib_k)$ for k = 1, 2, ..., n. Determine corresponding formulas for $c_{-k}$.

7. (a) If m and n are integers, prove that

$$\int_0^{2\pi} e^{inx} e^{-imx}\,dx = \begin{cases} 0 & \text{if } m \ne n , \\ 2\pi & \text{if } m = n . \end{cases}$$

(b) Use part (a) to deduce the orthogonality relations for the sine and cosine (m and n are
integers, $m^2 \ne n^2$):

$$\int_0^{2\pi} \sin nx \cos mx\,dx = \int_0^{2\pi} \sin nx \sin mx\,dx = \int_0^{2\pi} \cos nx \cos mx\,dx = 0 ,$$

$$\int_0^{2\pi} \sin^2 nx\,dx = \int_0^{2\pi} \cos^2 nx\,dx = \pi \quad \text{if } n \ne 0 .$$

8. Given a complex number $z \ne 0$. Write $z = re^{i\theta}$, where $\theta = \arg(z)$. Let $z_1 = Re^{i\alpha}$, where
$R = r^{1/n}$ and $\alpha = \theta/n$, and let $\epsilon = e^{2\pi i/n}$, where n is a positive integer.

(a) Show that $z_1^n = z$; that is, $z_1$ is an nth root of z.

(b) Show that z has exactly n distinct nth roots,

$$z_1, \ \epsilon z_1, \ \epsilon^2 z_1, \ \ldots, \ \epsilon^{n-1} z_1 ,$$

and that they are equally spaced on a circle of radius R.

(c) Determine the three cube roots of i.

(d) Determine the four fourth roots of i.

(e) Determine the four fourth roots of -i.

9. The definitions of the sine and cosine functions can be extended to the complex plane as
follows:

$$\cos z = \frac{e^{iz} + e^{-iz}}{2} , \qquad \sin z = \frac{e^{iz} - e^{-iz}}{2i} .$$

When z is real, these formulas agree with the ordinary sine and cosine functions. (See Exercise
4.) Use these formulas to deduce the following properties of complex sines and cosines. Here
u, v, and z denote complex numbers, with $z = x + iy$.

(a) $\sin (u + v) = \sin u \cos v + \cos u \sin v$.

(b) $\cos (u + v) = \cos u \cos v - \sin u \sin v$.

(c) $\sin^2 z + \cos^2 z = 1$.

(d) $\cos (iy) = \cosh y$, $\sin (iy) = i \sinh y$.

(e) $\cos z = \cos x \cosh y - i \sin x \sinh y$.

(f) $\sin z = \sin x \cosh y + i \cos x \sinh y$.

10. If z is a nonzero complex number, we define Log z, the complex logarithm of z, by the equation

$$\operatorname{Log} z = \log |z| + i \arg (z) .$$

When z is real and positive, this formula agrees with the ordinary logarithm. Use this formula
to deduce the following properties of complex logarithms.

(a) $\operatorname{Log} (-1) = \pi i$, $\operatorname{Log} (i) = \pi i/2$.

(b) $\operatorname{Log} (z_1 z_2) = \operatorname{Log} z_1 + \operatorname{Log} z_2 + 2n\pi i$, where n is an integer.

(c) $\operatorname{Log} (z_1/z_2) = \operatorname{Log} z_1 - \operatorname{Log} z_2 + 2n\pi i$, where n is an integer.

(d) $e^{\operatorname{Log} z} = z$.

11. If w and z are complex numbers, $z \ne 0$, we define $z^w$ by the equation

$$z^w = e^{w \operatorname{Log} z} ,$$

where Log z is defined as in Exercise 10.

(a) Compute $1^i$, $i^i$, and $(-1)^i$.

(b) Prove that $z^a z^b = z^{a+b}$ if a, b, and z are complex, $z \ne 0$.

(c) Note that the equation

$$(z_1 z_2)^w = z_1^w z_2^w \tag{9.17}$$

is violated when $z_1 = z_2 = -1$ and w = i. What conditions on $z_1$ and $z_2$ are necessary for
Equation (9.17) to hold for all complex w?

In Exercises 12 through 15, L denotes the linear operator defined by $L(y) = y'' + ay' + by$,
where a and b are real constants.

12. Prove that if R is a complex-valued function, say R(x) = P(x) + iQ(x), then a complex-valued
function f(x) = u(x) + iv(x) satisfies the differential equation L(y) = R(x) on an interval I
if and only if u and v satisfy the equations L(u) = P(x) and L(v) = Q(x) on I.

13. If A is complex and $\omega$ is real, prove that the differential equation $L(y) = Ae^{i\omega x}$ has a
complex-valued solution of the form $y = Be^{i\omega x}$, provided that either $b \ne \omega^2$ or $a\omega \ne 0$. Express the
complex number B in terms of a, b, A, and $\omega$.

14. Assume c is real and $b \ne \omega^2$. Use the results of Exercise 13 to prove that the differential
equation $L(y) = c \cos \omega x$ has a particular solution of the form $y = A \cos (\omega x - \alpha)$, where

$$A = \frac{c}{\sqrt{(b - \omega^2)^2 + a^2 \omega^2}} \quad \text{and} \quad \tan \alpha = \frac{a\omega}{b - \omega^2} .$$

15. Assume c is real and $b \ne \omega^2$. Prove that the differential equation $L(y) = c \sin \omega x$ has a
particular solution of the form $y = A \sin (\omega x + \alpha)$ and express A and $\alpha$ in terms of a, b, c, and $\omega$.

SEQUENCES, INFINITE SERIES,
IMPROPER INTEGRALS


10.1 Zeno’s paradox 

The principal subject matter of this chapter had its beginning nearly 2400 years ago 
when the Greek philosopher Zeno of Elea (495-435 b.c.) precipitated a crisis in ancient 
mathematics by setting forth a number of ingenious paradoxes. One of these, often called 
the racecourse paradox, may be described as follows: 

A runner can never reach the end of a racecourse because he must cover half of any 
distance before he covers the whole. That is to say, having covered the first half he 
still has the second half before him. When half of this is covered, one-fourth yet 
remains. When half of this one-fourth is covered, there remains one-eighth, and so
on, ad infinitum.

Zeno was referring, of course, to an idealized situation in which the runner is to be
thought of as a particle or point moving from one end of a line segment to the other. We
can formulate the paradox in another way. Assume that the runner starts at the point
marked 1 in Figure 10.1 and runs toward the goal marked 0. The positions labeled 1/2, 1/4,
1/8, etc., indicate the fraction of the course yet to be covered when these points are reached.
These fractions, each of which is half the previous one, subdivide the whole course into an
endless number of smaller portions. A positive amount of time is required to cover each
portion separately, and the time required for the whole course is the sum total of all these
amounts. To say that the runner can never reach the goal is to say that he never arrives
there in a finite length of time; or, in other words, that the sum of an endless number of
positive time intervals cannot possibly be finite.

This assertion was rejected 2000 years after Zeno’s time when the theory of infinite
series was created. In the 17th and 18th centuries, mathematicians began to realize that it
is possible to extend the ideas of ordinary addition from finite collections of numbers to
infinite collections so that sometimes infinitely many positive numbers have a finite “sum.”
To see how this extension might come about and to get an idea of some of the difficulties
that might be encountered in making the extension, let us analyze Zeno’s paradox in more
detail.

Suppose the aforementioned runner travels at a constant speed and suppose it takes him 
T minutes to cover the first half of the course. The next quarter of the course will take
T/2 minutes, the next eighth will take T/4 minutes, and, in general, the portion from
$1/2^n$ to $1/2^{n+1}$ will take $T/2^n$ minutes. The “sum” of all these time intervals may be
indicated symbolically by writing the following expression:

(10.1)    $T + \frac{T}{2} + \frac{T}{4} + \cdots + \frac{T}{2^n} + \cdots$

This is an example of what is known as an infinite series, and the problem here is to decide 
whether there is some reasonable way to assign a number which may be called the sum of 
this series. 

Figure 10.1 The racecourse paradox.

Our physical experience tells us that a runner who travels at a constant speed should
reach his goal in twice the time it takes for him to reach the halfway point. Since it takes
T minutes to cover half the course, it should require 2T minutes for the whole course.
This line of reasoning strongly suggests that we should assign the “sum” 2T to the series
in (10.1), and it leads us to expect that the equation

(10.2)    $T + \frac{T}{2} + \frac{T}{4} + \cdots + \frac{T}{2^n} + \cdots = 2T$


should be “true” in some sense. 

The theory of infinite series tells us exactly how to interpret this equation. The idea is 
this: First we add a finite number of the terms, say the first n, and denote their sum by $s_n$.
Thus we have

(10.3)    $s_n = T + \frac{T}{2} + \frac{T}{4} + \cdots + \frac{T}{2^{n-1}}$.

This is called the nth partial sum of the series. Now we study the behavior of $s_n$ as n takes
larger and larger values. In particular, we try to determine whether the partial sums $s_n$
approach a finite limit as n increases without bound.

In this example it is easy to see that 2T is the limiting value of the partial sums. In
fact, if we calculate a few of these partial sums, we find that

$s_1 = T$,    $s_2 = T + \frac{T}{2} = \frac{3}{2}T$,    $s_3 = T + \frac{T}{2} + \frac{T}{4} = \frac{7}{4}T$,

$s_4 = T + \frac{T}{2} + \frac{T}{4} + \frac{T}{8} = \frac{15}{8}T$.

Now, observe that these results may be expressed as follows:

$s_1 = (2 - 1)T$,    $s_2 = (2 - \frac{1}{2})T$,    $s_3 = (2 - \frac{1}{4})T$,    $s_4 = (2 - \frac{1}{8})T$.

This leads us to conjecture the following general formula: 


(10.4)    $s_n = \left(2 - \frac{1}{2^{n-1}}\right)T$    for all positive integers n.


Formula (10.4) is easily verified by induction. Since $1/2^{n-1} \to 0$ as n increases indefinitely,
this shows that $s_n \to 2T$. Therefore, Equation (10.2) is “true” if we interpret it to mean that
2T is the limit of the partial sums $s_n$. This limit process seems to invalidate the assertion
that the sum of an infinite number of time intervals can never be finite.
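The conjecture in (10.4) can also be checked by direct computation. The following is a minimal sketch in Python, using exact rational arithmetic; the function names `zeno_partial_sum` and `zeno_closed_form` are our own illustrative choices, and T is taken to be 1 unless another value is supplied.

```python
from fractions import Fraction


def zeno_partial_sum(n, T=Fraction(1)):
    """Return s_n = T + T/2 + ... + T/2**(n-1), added term by term as in (10.3)."""
    return sum(T / 2**k for k in range(n))


def zeno_closed_form(n, T=Fraction(1)):
    """Return (2 - 1/2**(n-1)) * T, the right-hand side of (10.4)."""
    return (2 - Fraction(1, 2 ** (n - 1))) * T


# Compare the two expressions and show how far s_n still is from 2T.
for n in (1, 2, 3, 4, 10, 20):
    s = zeno_partial_sum(n)
    assert s == zeno_closed_form(n)
    print(n, s, float(2 - s))
```

A finite check of this kind does not replace the induction proof; it only shows the gap 2T − $s_n$ being halved at each step.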

Now we shall give an argument which lends considerable support to Zeno’s point of 
view. Suppose we make a small but important change in the foregoing analysis of the 
racecourse paradox. Instead of assuming that the speed of the runner is constant, let us 
suppose that his speed gradually decreases in such a way that he requires T minutes to
go from 1 to 1/2, T/2 minutes to go from 1/2 to 1/4, T/3 minutes to go from 1/4 to 1/8,
and, in general, T/n minutes to go from $1/2^{n-1}$ to $1/2^n$. The “total time” for the course
may now be represented by the following infinite series: 


(10.5)    $T + \frac{T}{2} + \frac{T}{3} + \cdots + \frac{T}{n} + \cdots$

In this case, our physical experience does not suggest any natural or obvious “sum” to 
assign to this series, and hence we must rely entirely on mathematical analysis to deal with 
this example. 

Let us proceed as before and introduce the partial sums $s_n$. That is, let

(10.6)    $s_n = T + \frac{T}{2} + \frac{T}{3} + \cdots + \frac{T}{n}$.

Our object is to decide what happens to $s_n$ for larger and larger values of n. These partial
sums are not as easy to study as those in (10.3) because there is no simple formula analogous
to (10.4) for simplifying the expression on the right of (10.6). Nevertheless, it is easy to
obtain an estimate for the size of $s_n$ if we compare the partial sum with an appropriate
integral.

Figure 10.2 shows the graph of the function f(x) = 1/x for x > 0. (The scale is distorted 
along the y-axis.) The rectangles shown there have a total area equal to the sum 

(10.7)    $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$.

The area of the shaded region is $\int_1^{n+1} x^{-1}\,dx = \log (n + 1)$. Since this area cannot exceed
the sum of the areas of the rectangles, we have the inequality

(10.8)    $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \geq \log (n + 1)$.

Multiplying both sides by T, we obtain $s_n \geq T \log (n + 1)$. In other words, if the runner’s
speed decreases in the manner described above, the time required to reach the point $1/2^n$
is at least $T \log (n + 1)$ minutes. Since $\log (n + 1)$ increases without bound as n increases,
we must agree with Zeno and conclude that the runner cannot reach his goal in any finite
time.
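The inequality between the sum 1 + 1/2 + ⋯ + 1/n and log (n + 1) can be observed numerically. The sketch below assumes Python's standard `math` module; the name `harmonic_partial_sum` is ours, chosen for illustration.

```python
import math


def harmonic_partial_sum(n):
    """Return 1 + 1/2 + ... + 1/n, summed with math.fsum to limit rounding error."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


# The partial sums keep growing, and each one is at least log(n + 1).
for n in (1, 10, 100, 1000, 100000):
    h = harmonic_partial_sum(n)
    print(n, h, math.log(n + 1), h >= math.log(n + 1))
```

The growth is very slow, which is why numerical experiment alone might wrongly suggest a finite limit; the comparison with the integral is what settles the question.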

The general theory of infinite series makes a distinction between series like (10.1) whose 
partial sums tend to a finite limit, and those like (10.5) whose partial sums have no finite
limit. The former are called convergent, the latter divergent. Early investigators in the
field paid little or no attention to questions of convergence or divergence. They treated 
infinite series as though they were ordinary finite sums, subject to the usual laws of algebra, 
not realizing that these laws cannot be universally extended to infinite series. Therefore, 
it is not surprising that some of the results they obtained were later shown to be incorrect. 
Fortunately, many of the early pioneers possessed unusual intuition and skill which 
prevented them from arriving at too many false conclusions, even though they could not 
justify all their methods. Foremost among these men was Leonhard Euler, who discovered
one beautiful formula after another and at the same time used infinite series as a unifying
idea to bring together many branches of mathematics, hitherto unrelated. The great 
quantity of Euler’s work that has survived the test of history is a tribute to his remarkable 
instinct for what is mathematically correct. 

The widespread use of infinite series began late in the 17th century, nearly fifty years
before Euler was born, and coincided with the early development of the integral calculus.
Nicholas Mercator (1620-1687) and William Brouncker (1620-1684) discovered an infinite
series for the logarithm in 1668 while attempting to calculate the area of a hyperbolic
segment. Shortly thereafter, Newton discovered the binomial series. This discovery proved
to be a landmark in the history of mathematics. A special case of the binomial series is
the now-familiar binomial theorem which states that


$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k$,

where x is an arbitrary real number, n is a nonnegative integer, and $\binom{n}{k}$ is the binomial
coefficient. Newton found that this formula could be extended from integer values of
the exponent n to arbitrary real values of n by replacing the finite sum on the right by a
suitable infinite series, although he gave no proof of this fact. Actually, a careful treatment
of the binomial series raises some rather delicate questions of convergence that could not 
have been answered in Newton’s time. 

Shortly after Euler’s death in 1783, the flood of new discoveries began to recede and the 
formal period in the history of series came to a close. A new and more critical period 
began in 1812 when Gauss published a celebrated memoir which contained, for the first 
time in history, a thorough and rigorous treatment of the convergence of a particular 
infinite series. A few years later Cauchy introduced an analytic definition of the limit 
concept in his treatise Cours d’analyse algébrique (published in 1821) and laid the
foundations of the modern theory of convergence and divergence. The rudiments of that theory
are discussed in the sections that follow. 


10.2 Sequences 

In everyday usage of the English language, the words “sequence” and “series” are 
synonyms, and they are used to suggest a succession of things or events arranged in some 
order. In mathematics these words have special technical meanings. The word “sequence” 
is employed as in the common use of the term to convey the idea of a set of things arranged 
in order, but the word “series” is used in a somewhat different sense. The concept of a
sequence will be discussed in this section, and series will be defined in Section 10.5. 

If for every positive integer n there is associated a real or complex number $a_n$, then the
ordered set

$a_1, a_2, \ldots, a_n, \ldots$

is said to define an infinite sequence. The important thing here is that each member of
the set has been labeled with an integer so that we may speak of the first term $a_1$, the second
term $a_2$, and, in general, the nth term $a_n$. Each term $a_n$ has a successor $a_{n+1}$ and hence
there is no “last” term.

The most common examples of sequences can be constructed if we give some rule or
formula for describing the nth term. Thus, for example, the formula $a_n = 1/n$ defines a
sequence whose first five terms are

$1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}$.

Sometimes two or more formulas may be employed as, for example,

$a_{2n-1} = 1$,    $a_{2n} = 2n^2$,

the first few terms in this case being

1, 2, 1, 8, 1, 18, 1, 32, 1.

Another common way to define a sequence is by a set of instructions which explains how 
to carry on after a given start. Thus we may have 

$a_1 = a_2 = 1$,    $a_{n+1} = a_n + a_{n-1}$    for $n \geq 2$.

This particular rule is known as a recursion formula, and it defines a famous sequence
whose terms are called the Fibonacci† numbers. The first few terms are

1, 1, 2, 3, 5, 8, 13, 21, 34.
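Both ways of specifying a sequence, by a formula for the nth term and by a recursion, translate directly into short programs. The following sketch is illustrative only; the names `two_formula_term` and `fibonacci_terms` are ours, and the first function assumes the pair of formulas whose terms are listed above as 1, 2, 1, 8, 1, 18, . . . .

```python
def two_formula_term(m):
    """Term number m of the sequence given by two formulas:
    odd-numbered terms equal 1, and the term numbered 2n equals 2*n**2."""
    if m % 2 == 1:
        return 1
    n = m // 2
    return 2 * n * n


def fibonacci_terms(count):
    """First `count` terms of the recursion a_1 = a_2 = 1, a_{n+1} = a_n + a_{n-1}."""
    terms = []
    a, b = 1, 1  # a is the current term, b the next one
    for _ in range(count):
        terms.append(a)
        a, b = b, a + b
    return terms


print([two_formula_term(m) for m in range(1, 10)])
print(fibonacci_terms(9))
```

In the recursive case each new term requires only the two preceding ones, so the program carries just two values forward rather than storing the whole sequence.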

In any sequence the essential thing is that there be some function f defined on the positive 
integers such that f(n) is the nth term of the sequence for each n = 1, 2, 3, .... In fact, 
this is probably the most convenient way to state a technical definition of sequence. 

definition. A function f whose domain is the set of all positive integers 1, 2, 3, . . . is
called an infinite sequence. The function value f(n) is called the nth term of the sequence.

The range of the function (that is, the set of function values) is usually displayed by writing 
the terms in order, thus: 

f(1), f(2), f(3), . . . , f(n), . . . .

For brevity, the notation {f(n)} is used to denote the sequence whose nth term is f(n).
Very often the dependence on n is denoted by using subscripts, and we write $a_n$, $s_n$, $x_n$, $u_n$,
or something similar instead of f(n). Unless otherwise specified, all sequences in this
chapter are assumed to have real or complex terms.

The main question we are concerned with here is to decide whether or not the terms 
f(n) tend to a finite limit as n increases indefinitely. To treat this problem, we must extend
the limit concept to sequences. This is done as follows. 

definition. A sequence {f(n)} is said to have a limit L if, for every positive number ε,
there is another positive number N (which may depend on ε) such that

$|f(n) - L| < \varepsilon$    for all $n \geq N$.

In this case, we say the sequence {f(n)} converges to L and we write

$\lim_{n \to \infty} f(n) = L$,    or    $f(n) \to L$ as $n \to \infty$.

A sequence which does not converge is called divergent.
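To see how N depends on ε in a concrete case, consider the sequence f(n) = 1/n with limit L = 0. The condition |1/n − 0| < ε holds exactly when n > 1/ε, so the smallest admissible N is the first integer exceeding 1/ε. The sketch below computes it with exact rational arithmetic; the function name `smallest_N_for_reciprocal` is our own, and ε is passed as a string or `Fraction` so that values such as 0.1 are represented exactly.

```python
import math
from fractions import Fraction


def smallest_N_for_reciprocal(eps):
    """Smallest positive integer N such that 1/n < eps for every n >= N."""
    eps = Fraction(eps)
    if eps <= 0:
        raise ValueError("eps must be positive")
    # 1/n < eps is equivalent to n > 1/eps.
    return math.floor(1 / eps) + 1


for eps in ("1", "0.1", "0.01", "0.001"):
    print(eps, smallest_N_for_reciprocal(eps))
```

Any larger N works equally well; the definition asks only that some N exist for each ε, not that it be the smallest.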

In this definition the function values f (n) and the limit L may be real or complex numbers. 

If f and L are complex, we may decompose them into their real and imaginary parts, say
f = u + iv and L = a + ib. Then we have f(n) − L = u(n) − a + i[v(n) − b]. The
inequalities

$|u(n) - a| \leq |f(n) - L|$    and    $|v(n) - b| \leq |f(n) - L|$

show that the relation f(n) → L implies u(n) → a and v(n) → b as n → ∞. Conversely, the
inequality

$|f(n) - L| \leq |u(n) - a| + |v(n) - b|$

shows that the two relations u(n) → a and v(n) → b imply f(n) → L as n → ∞. In other
words, a complex-valued sequence f converges if and only if both the real part u and the
imaginary part v converge separately, in which case we have

$\lim_{n \to \infty} f(n) = \lim_{n \to \infty} u(n) + i \lim_{n \to \infty} v(n)$.

† Fibonacci, also known as Leonardo of Pisa (circa 1175-1250), encountered this sequence in a problem
concerning the offspring of rabbits.

It is clear that any function defined for all positive real x may be used to construct a 
sequence by restricting x to take only integer values. This explains the strong analogy 
between the definition just given and the one in Section 7.14 for more general functions. 
The analogy carries over to infinite limits as well, and we leave it for the reader to define
the symbols

$\lim_{n \to \infty} f(n) = +\infty$    and    $\lim_{n \to \infty} f(n) = -\infty$

as was done in Section 7.15 when f is real-valued. If f is complex, we write f(n) → ∞ as
n → ∞ if |f(n)| → +∞.

The phrase “convergent sequence” is used only for a sequence whose limit is finite. A 
sequence with an infinite limit is said to diverge. There are, of course, divergent sequences 
that do not have infinite limits. Examples are defined by the following formulas: 

$f(n) = (-1)^n$,    $f(n) = \sin \frac{n\pi}{2}$,    $f(n) = (-1)^n \left(1 + \frac{1}{n}\right)$,    $f(n) = e^{\pi i n/2}$.

The basic rules for dealing with limits of sums, products, etc., also hold for limits of 
convergent sequences. The reader should have no difficulty in formulating these theorems 
for himself. Their proofs are somewhat similar to those given in Section 3.5. 

The convergence or divergence of many sequences may be determined by using properties
of familiar functions that are defined for all positive x. We mention a few important 
examples of real-valued sequences whose limits may be found directly or by using some of 
the results derived in Chapter 7. 


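(10.9)    $\lim_{n \to \infty} \frac{1}{n^{\alpha}} = 0$    if $\alpha > 0$.

(10.10)    $\lim_{n \to \infty} x^n = 0$    if $|x| < 1$.

(10.11)    $\lim_{n \to \infty} \frac{(\log n)^a}{n^b} = 0$    for all $a > 0$, $b > 0$.

(10.12)    $\lim_{n \to \infty} n^{1/n} = 1$.

(10.13)    $\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)^n = e^a$    for all real a.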
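The limits listed above can be made plausible by evaluating the corresponding expressions at a few large values of n. The sketch below is a numerical illustration only, not a proof; the function names are ours, and the particular exponents used in the loop are arbitrary sample choices.

```python
import math


def power_reciprocal(n, alpha):
    """1 / n**alpha, which tends to 0 when alpha > 0."""
    return 1.0 / n**alpha


def log_over_power(n, a, b):
    """(log n)**a / n**b, which tends to 0 when a > 0 and b > 0."""
    return math.log(n) ** a / n**b


def nth_root_of_n(n):
    """n**(1/n), which tends to 1."""
    return n ** (1.0 / n)


def compound(n, a):
    """(1 + a/n)**n, which tends to e**a."""
    return (1.0 + a / n) ** n


for n in (10, 10**3, 10**6):
    print(n,
          power_reciprocal(n, 0.5),
          log_over_power(n, 2, 0.5),
          nth_root_of_n(n),
          compound(n, 1.0))
```

The second column shows how slowly a power of the logarithm is overtaken by a power of n: for small n the quotient is still larger than 1, and it becomes small only for quite large n.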
10.3 Monotonic sequences of real numbers

A sequence {f(n)} is said to be increasing if

$f(n) \leq f(n + 1)$    for all $n \geq 1$.

We indicate this briefly by writing f(n)↗. If, on the other hand, we have

$f(n) \geq f(n + 1)$    for all $n \geq 1$,

we call the sequence decreasing and write f(n)↘. A sequence is called monotonic if it is
increasing or if it is decreasing.

Monotonic sequences are pleasant to work with because their convergence or divergence 
is particularly easy to determine. In fact, we have the following simple criterion. 


THEOREM 10.1. A monotonic sequence converges if and only if it is bounded.

Note: A sequence {f(n)} is called bounded if there exists a positive number M such that
|f(n)| ≤ M for all n. A sequence that is not bounded is called unbounded.

Proof. It is clear that an unbounded sequence cannot converge. Therefore, all we need 
to prove is that a bounded monotonic sequence must converge. 

Assume f(n)↗ and let L denote the least upper bound of the set of function values.
(Since the sequence is bounded, it has a least upper bound by Axiom 10 of the real-number
system.) Then f(n) ≤ L for all n, and we shall prove that the sequence converges to L.

Figure 10.3 A bounded increasing sequence converges to its least upper bound.

Choose any positive number ε. Since L − ε cannot be an upper bound for all numbers
f(n), we must have L − ε < f(N) for some N. (This N may depend on ε.) If n ≥ N,
we have f(N) ≤ f(n) since f(n)↗. Hence, we have L − ε < f(n) ≤ L for all n ≥ N, as
illustrated in Figure 10.3. From these inequalities we find that

$0 \leq L - f(n) < \varepsilon$    for all $n \geq N$

and this means that the sequence converges to L, as asserted.

If f(n)↘, the proof is similar, the limit in this case being the greatest lower bound of the
set of function values.
10.4 Exercises

In Exercises 1 through 22, a sequence {f(n)} is defined by the formula given. In each case, (a)
determine whether the sequence converges or diverges, and (b) find the limit of each convergent
sequence. In some cases it may be helpful to replace the integer n by an arbitrary positive real x
and to study the resulting function of x by the methods of Chapter 7. You may use formulas (10.9)
through (10.13) listed at the end of Section 10.2.


!•/(") = 


2. f(n) = 


n 2 + 1 


17 m y ' + (-2)" 

• J' n> = 3«+i + ( _2)«+i 1 

13. fin) = \‘n + 1 - fi. 


m r 

3. f(n) = COS — . 

« 2 + 3n 

4. f (n) = 


5 - /(») = Jn • 

6. f(n) = 1 + ( -If. 


7. f(n) = 


1 + (-l) n 


*./<«)= + 
9. fin) = 2 1/n , 


14. f(n) 


15. fin) 


where lal < 1. 


,,, 100 , 000/1 


17. fin) 


18. fin) : 

19. f{n ) 

20. fin) 


-'W 

=(>+r 


10. f(n) 


11 - /(«) 


n 2/3 sin (n !) 
n + 1 


21. fin) = - e 

n 

22. fin) = ne~ 


Each of the sequences {$a_n$} in Exercises 23 through 28 is convergent. Therefore, for every
preassigned ε > 0, there exists an integer N (depending on ε) such that $|a_n - L| < \varepsilon$ if n ≥ N, where
$L = \lim_{n \to \infty} a_n$. In each case, determine a value of N that is suitable for each of the following
values of ε: ε = 1, 0.1, 0.01, 0.001, 0.0001.


23. a n = - . 
n 


26. a n = — . 
n\ 



27. a n = 


In 

n 3 + 1 ' 


28. a n =(-!)” 



29. Prove that a sequence cannot converge to two different limits.

30. Assume $\lim_{n \to \infty} a_n = 0$. Use the definition of limit to prove that $\lim_{n \to \infty} a_n^2 = 0$.

31. If $\lim_{n \to \infty} a_n = A$ and $\lim_{n \to \infty} b_n = B$, use the definition of limit to prove that we have
$\lim_{n \to \infty} (a_n + b_n) = A + B$, and $\lim_{n \to \infty} (c a_n) = cA$, where c is a constant.

32. From the results of Exercises 30 and 31, prove that if $\lim_{n \to \infty} a_n = A$ then $\lim_{n \to \infty} a_n^2 = A^2$.
Then use the identity $2 a_n b_n = (a_n + b_n)^2 - a_n^2 - b_n^2$ to prove that $\lim_{n \to \infty} (a_n b_n) = AB$ if
$\lim_{n \to \infty} a_n = A$ and $\lim_{n \to \infty} b_n = B$.
33. If α is a real number and n a nonnegative integer, the binomial coefficient $\binom{\alpha}{n}$ is defined by
the equation

$\binom{\alpha}{n} = \frac{\alpha(\alpha - 1)(\alpha - 2) \cdots (\alpha - n + 1)}{n!}$.

(a) When α = −1/2, show that



(b) Let $a_n = (-1)^n \binom{-1/2}{n}$. Prove that $a_n > 0$ and that $a_{n+1} < a_n$.

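34. Let f be a real-valued function that is monotonic increasing and bounded on the interval
[0, 1]. Define two sequences {$s_n$} and {$t_n$} as follows:

$s_n = \frac{1}{n} \sum_{k=0}^{n-1} f\left(\frac{k}{n}\right)$,    $t_n = \frac{1}{n} \sum_{k=1}^{n} f\left(\frac{k}{n}\right)$.

(a) Prove that $s_n \leq \int_0^1 f(x)\,dx \leq t_n$ and that $0 \leq \int_0^1 f(x)\,dx - s_n \leq \frac{f(1) - f(0)}{n}$.

(b) Prove that both sequences {$s_n$} and {$t_n$} converge to the limit $\int_0^1 f(x)\,dx$.

(c) State and prove a corresponding result for the interval [a, b].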

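35. Use Exercise 34 to establish the following limit relations:

(a) $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \left(\frac{k}{n}\right)^2 = \frac{1}{3}$.

(b) $\lim_{n \to \infty} \sum_{k=1}^{n} \frac{1}{n + k} = \log 2$.

(c) $\lim_{n \to \infty} \sum_{k=1}^{n} \frac{n}{n^2 + k^2} = \frac{\pi}{4}$.

(d) $\lim_{n \to \infty} \sum_{k=1}^{n} \frac{1}{\sqrt{n^2 + k^2}} = \log (1 + \sqrt{2})$.

(e) $\lim_{n \to \infty} \sum_{k=1}^{n} \frac{1}{n} \sin \frac{k\pi}{n} = \frac{2}{\pi}$.

(f) $\lim_{n \to \infty} \sum_{k=1}^{n} \frac{1}{n} \sin^2 \frac{k\pi}{n} = \frac{1}{2}$.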
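The two averaging sums of Exercise 34 are easy to experiment with before attempting the proofs. The sketch below is a hedged illustration: the function names `lower_sum`, `upper_sum`, and `square` are ours, and the sample function f(x) = x², whose integral over [0, 1] is 1/3, is our own choice of an increasing function. Exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction


def lower_sum(f, n):
    """(1/n) * sum of f(k/n) for k = 0, 1, ..., n-1."""
    return sum(f(Fraction(k, n)) for k in range(n)) / n


def upper_sum(f, n):
    """(1/n) * sum of f(k/n) for k = 1, 2, ..., n."""
    return sum(f(Fraction(k, n)) for k in range(1, n + 1)) / n


def square(x):
    """Sample increasing function on [0, 1]; its integral there is 1/3."""
    return x * x


for n in (1, 10, 100, 1000):
    s, t = lower_sum(square, n), upper_sum(square, n)
    # The integral is trapped between s and t, and t - s = (f(1) - f(0)) / n.
    print(n, float(s), float(t), t - s)
```

The difference between the two sums shrinks like 1/n, which is the quantitative content of part (a) for this particular f.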


10.5 Infinite series 

From a given sequence of real or complex numbers, we can always generate a new 
sequence by adding together successive terms. Thus, if the given sequence has the terms 

$a_1, a_2, \ldots, a_n, \ldots$,

we may form, in succession, the “partial sums”

$s_1 = a_1$,    $s_2 = a_1 + a_2$,    $s_3 = a_1 + a_2 + a_3$,

and so on, the partial sum of the first n terms being defined as follows:

(10.14)    $s_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k$.
The sequence {$s_n$} of partial sums is called an infinite series, or simply a series, and is also
denoted by the following symbols:

(10.15)    $a_1 + a_2 + a_3 + \cdots$,    $a_1 + a_2 + \cdots + a_n + \cdots$,    $\sum_{k=1}^{\infty} a_k$.

For example, the series $\sum_{k=1}^{\infty} 1/k$ represents the sequence {$s_n$} for which

$s_n = \sum_{k=1}^{n} \frac{1}{k}$.

The symbols in (10.15) are intended to remind us that the sequence of partial sums {$s_n$}
is obtained from the sequence {$a_n$} by addition of successive terms.

If there is a real or complex number S such that

$\lim_{n \to \infty} s_n = S$,

we say that the series $\sum_{k=1}^{\infty} a_k$ is convergent and has the sum S, in which case we write

$\sum_{k=1}^{\infty} a_k = S$.

If {$s_n$} diverges, we say that the series $\sum_{k=1}^{\infty} a_k$ diverges and has no sum.

example 1. The harmonic series. In the discussion of Zeno’s paradox, we showed that
the partial sums $s_n$ of the series $\sum_{k=1}^{\infty} 1/k$ satisfy the inequality

$s_n = \sum_{k=1}^{n} \frac{1}{k} \geq \log (n + 1)$.

Since $\log (n + 1) \to \infty$ as $n \to \infty$, the same is true of $s_n$, and hence the series $\sum_{k=1}^{\infty} 1/k$
diverges. This series is called the harmonic series.


example 2. In the discussion of Zeno’s paradox, we also encountered the partial sums
of the series $1 + \frac{1}{2} + \frac{1}{4} + \cdots$, given by the formula

$\sum_{k=1}^{n} \frac{1}{2^{k-1}} = 2 - \frac{1}{2^{n-1}}$,

which is easily proved by induction. As $n \to \infty$, these partial sums approach the limit 2,
and hence the series converges and has sum 2. We may indicate this by writing

(10.16)    $1 + \frac{1}{2} + \frac{1}{4} + \cdots = 2$.

The reader should realize that the word “sum” is used here in a very special sense. The 
sum of a convergent series is not obtained by ordinary addition but rather as the limit 
of the sequence of partial sums. Also, the reader should note that for a convergent series,
the symbol $\sum_{k=1}^{\infty} a_k$ is used to denote both the series and its sum, even though the two are
conceptually distinct. The sum represents a number and it is not capable of being
convergent or divergent. Once the distinction between a series and its sum has been realized,
the use of one symbol to represent both should cause no confusion.

As in the case of finite summation notation, the letter k used in the symbol $\sum_{k=1}^{\infty} a_k$ is a
“dummy index” and may be replaced by any other convenient symbol. The letters n, m,
and r are commonly used for this purpose. Sometimes it is desirable to start the summation
from k = 0 or from k = 2 or from some other value of k. Thus, for example, the series
in (10.16) could be written as $\sum_{k=0}^{\infty} 1/2^k$. In general, if p ≥ 0, we define the symbol $\sum_{k=p}^{\infty} a_k$
to mean the same as $\sum_{k=1}^{\infty} b_k$, where $b_k = a_{p+k-1}$. Thus $b_1 = a_p$, $b_2 = a_{p+1}$, etc. When there
is no danger of confusion or when the starting point is unimportant, we write $\sum a_k$ instead
of $\sum_{k=p}^{\infty} a_k$.

It is easy to prove that the two series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=p}^{\infty} a_k$ both converge or both diverge.
Suppose we let $s_n = a_1 + \cdots + a_n$ and $t_n = a_p + a_{p+1} + \cdots + a_{p+n-1}$. If p = 0, we
have $t_{n+1} = a_0 + s_n$, so if $s_n \to S$ as $n \to \infty$, then $t_n \to a_0 + S$ and, conversely, if $t_n \to T$
as $n \to \infty$, then $s_n \to T - a_0$. Therefore, both series converge or both diverge when p = 0.
The same holds true if p ≥ 1. For p = 1, we have $s_n = t_n$, and for p ≥ 2, we have
$t_n = s_{n+p-1} - s_{p-1}$, and again it follows that the sequences {$s_n$} and {$t_n$} both converge
or both diverge. This is often described by saying that a finite number of terms may be
omitted or added at the beginning of a series without affecting its convergence or divergence.


10.6 The linearity property of convergent series 

Ordinary finite sums have the following important properties: 


(10.17)    $\sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k$    (additive property)

and

(10.18)    $\sum_{k=1}^{n} (c a_k) = c \sum_{k=1}^{n} a_k$    (homogeneous property).

The next theorem provides a natural extension of these properties to convergent infinite 
series and thereby justifies many algebraic manipulations in which convergent series are 
treated as though they were finite sums. Both additivity and homogeneity may be com- 
bined into one property called linearity which may be described as follows : 


theorem 10.2. Let $\sum a_n$ and $\sum b_n$ be convergent infinite series of complex terms and
let α and β be complex constants. Then the series $\sum (\alpha a_n + \beta b_n)$ also converges, and its sum
is given by the equation

(10.19)    $\sum_{n=1}^{\infty} (\alpha a_n + \beta b_n) = \alpha \sum_{n=1}^{\infty} a_n + \beta \sum_{n=1}^{\infty} b_n$.

Proof. Using (10.17) and (10.18), we may write

(10.20)    $\sum_{k=1}^{n} (\alpha a_k + \beta b_k) = \alpha \sum_{k=1}^{n} a_k + \beta \sum_{k=1}^{n} b_k$.

When $n \to \infty$, the first term on the right of (10.20) tends to $\alpha \sum_{k=1}^{\infty} a_k$ and the second term
tends to $\beta \sum_{k=1}^{\infty} b_k$. Therefore the left-hand side tends to their sum, and this proves that
the series $\sum (\alpha a_k + \beta b_k)$ converges to the sum indicated by (10.19).

Theorem 10.2 has an interesting corollary which is often used to establish the divergence 
of a series. 


theorem 10.3. If $\sum a_n$ converges and if $\sum b_n$ diverges, then $\sum (a_n + b_n)$ diverges.

Proof. Since $b_n = (a_n + b_n) - a_n$, and since $\sum a_n$ converges, Theorem 10.2 tells us that
convergence of $\sum (a_n + b_n)$ implies convergence of $\sum b_n$. Therefore, $\sum (a_n + b_n)$ cannot
converge if $\sum b_n$ diverges.

example. The series $\sum (1/k + 1/2^k)$ diverges because $\sum 1/k$ diverges and $\sum 1/2^k$ converges.

If $\sum a_n$ and $\sum b_n$ are both divergent, the series $\sum (a_n + b_n)$ may or may not converge. For
example, when $a_n = b_n = 1$ for all n, then $\sum (a_n + b_n)$ diverges. But when $a_n = 1$ and
$b_n = -1$ for all n, then $\sum (a_n + b_n)$ converges.

10.7 Telescoping series

Another important property of finite sums is the telescoping property which states that

(10.21)    $\sum_{k=1}^{n} (b_k - b_{k+1}) = b_1 - b_{n+1}$.

When we try to extend this property to infinite series we are led to consider those series
$\sum a_n$ for which each term $a_n$ may be expressed as a difference of the form

(10.22)    $a_n = b_n - b_{n+1}$.

These series are known as telescoping series and their behavior is characterized by the
following theorem.

THEOREM 10.4. Let {$a_n$} and {$b_n$} be two sequences of complex numbers such that

(10.23)    $a_n = b_n - b_{n+1}$    for n = 1, 2, 3, . . . .

Then the series $\sum a_n$ converges if and only if the sequence {$b_n$} converges, in which case we have

(10.24)    $\sum_{n=1}^{\infty} a_n = b_1 - L$,    where $L = \lim_{n \to \infty} b_n$.
Proof. Let $s_n$ denote the nth partial sum of $\sum a_n$. Then we have

$s_n = \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} (b_k - b_{k+1}) = b_1 - b_{n+1}$,

because of (10.21). Therefore, both sequences {$s_n$} and {$b_n$} converge or both diverge.
Moreover, if $b_n \to L$ as $n \to \infty$, then $s_n \to b_1 - L$, and this proves (10.24).

Note: Every series is telescoping because we can always satisfy (10.22) if we first choose
$b_1$ to be arbitrary and then choose $b_{n+1} = b_1 - s_n$ for n ≥ 1, where $s_n = a_1 + \cdots + a_n$.

example 1. Let $a_n = 1/(n^2 + n)$. Then we have

$a_n = \frac{1}{n(n + 1)} = \frac{1}{n} - \frac{1}{n + 1}$,

and hence (10.23) holds with $b_n = 1/n$. Since $b_1 = 1$ and L = 0, we obtain

$\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = 1$.
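The telescoping in this example can be watched term by term. In the sketch below, which uses exact rational arithmetic, the name `telescoping_partial_sum` is our own; it adds the terms 1/(k(k + 1)) directly, without using the decomposition, so that the result can be compared with the value $b_1 - b_{n+1} = 1 - 1/(n + 1)$ predicted by the telescoping property.

```python
from fractions import Fraction


def telescoping_partial_sum(n):
    """s_n = sum of 1/(k(k+1)) for k = 1, ..., n, added term by term."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


for n in (1, 2, 3, 10, 100):
    s = telescoping_partial_sum(n)
    # Telescoping predicts s_n = b_1 - b_{n+1} with b_n = 1/n.
    assert s == 1 - Fraction(1, n + 1)
    print(n, s)
```

Since 1/(n + 1) tends to 0, the partial sums approach 1, in agreement with Theorem 10.4.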




example 2. If x is not a negative integer, we have the decomposition

$\frac{1}{(n + x)(n + x + 1)(n + x + 2)} = \frac{1}{2} \left( \frac{1}{(n + x)(n + x + 1)} - \frac{1}{(n + x + 1)(n + x + 2)} \right)$

for each integer n ≥ 1. Therefore, by the telescoping property, the following series
converges and has the sum indicated:

$\sum_{n=1}^{\infty} \frac{1}{(n + x)(n + x + 1)(n + x + 2)} = \frac{1}{2(x + 1)(x + 2)}$.

example 3. Since log [n/(n + 1)] = log n − log (n + 1), and since log n → ∞ as n → ∞,
the series $\sum \log [n/(n + 1)]$ diverges.

Note: Telescoping series illustrate an important difference between finite sums and
infinite series. If we write (10.21) in extended form, it becomes

$(b_1 - b_2) + (b_2 - b_3) + \cdots + (b_n - b_{n+1}) = b_1 - b_{n+1}$,

which can be verified by merely removing parentheses and canceling. Suppose now we
perform the same operations on the infinite series

$(b_1 - b_2) + (b_2 - b_3) + (b_3 - b_4) + \cdots$.

We leave $b_1$, cancel $b_2$, cancel $b_3$, and so on. For each n > 1, at some stage we cancel $b_n$.
Thus every $b_n$ cancels with the exception of $b_1$. This leads us to the conclusion that the sum
of the series is $b_1$. Because of Theorem 10.4, this conclusion is false unless $\lim_{n \to \infty} b_n = 0$.
This shows that parentheses cannot always be removed in an infinite series as they can in a
finite sum. (See also Exercise 24 in Section 10.9.)


10.8 The geometric series 

The telescoping property of finite sums may be used to study a very important example 
known as the geometric series. This series is generated by successive addition of the terms 
in a geometric progression and has the form $\sum x^n$, where the nth term $x^n$ is the nth power
of a fixed real or complex number x. It is convenient to start this series with n = 0, with
the understanding that the initial term, $x^0$, is equal to 1.

Let $s_n$ denote the nth partial sum of this series, so that

$s_n = 1 + x + x^2 + \cdots + x^{n-1}$.

If x = 1, each term on the right is 1 and $s_n = n$. In this case, the series diverges since
$s_n \to \infty$ as $n \to \infty$. If x ≠ 1, we may simplify the sum for $s_n$ by writing

$(1 - x) s_n = (1 - x) \sum_{k=0}^{n-1} x^k = \sum_{k=0}^{n-1} (x^k - x^{k+1}) = 1 - x^n$,

since the last sum telescopes. Dividing by 1 − x, we obtain the formula

$s_n = \frac{1 - x^n}{1 - x}$    if x ≠ 1.

This shows that the behavior of $s_n$ for large n depends entirely on the behavior of $x^n$.
When |x| < 1, then $x^n \to 0$ as $n \to \infty$, and the series converges to the sum 1/(1 − x).

Since $s_{n+1} - s_n = x^n$, convergence of {$s_n$} implies $x^n \to 0$ as $n \to \infty$. Therefore, if
|x| ≥ 1 the sequence {$s_n$} diverges since $x^n$ does not tend to 0 in this case. Thus we have
proved the following theorem.


theorem 10.5. If x is complex, with |x| < 1, the geometric series $\sum_{n=0}^{\infty} x^n$ converges
and has sum 1/(1 − x). That is to say, we have

(10.25)    $1 + x + x^2 + \cdots + x^n + \cdots = \frac{1}{1 - x}$    if |x| < 1.

If |x| ≥ 1, the series diverges.
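The formula for the partial sums of the geometric series, and the limit 1/(1 − x), can be checked for both rational and complex x. The following is a sketch under our own naming (`geometric_partial_sum`, `geometric_closed_form`); it relies only on Python's built-in complex numbers and the standard `fractions` module, and the sample value x = 0.5 + 0.5i is an arbitrary choice with |x| < 1.

```python
from fractions import Fraction


def geometric_partial_sum(x, n):
    """s_n = 1 + x + x**2 + ... + x**(n-1), added term by term."""
    total, power = 0, 1
    for _ in range(n):
        total += power
        power *= x
    return total


def geometric_closed_form(x, n):
    """(1 - x**n) / (1 - x), valid only when x != 1."""
    if x == 1:
        raise ValueError("the closed form requires x != 1")
    return (1 - x**n) / (1 - x)


# Exact agreement for a rational x.
half = Fraction(1, 2)
assert geometric_partial_sum(half, 10) == geometric_closed_form(half, 10)

# Numerical convergence to 1/(1 - x) for a complex x with |x| < 1.
z = 0.5 + 0.5j
for n in (5, 20, 80):
    print(n, abs(geometric_partial_sum(z, n) - 1 / (1 - z)))
```

The printed distances shrink at the same rate as $|x|^n$, which is what the formula for $s_n$ leads us to expect.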


The geometric series, with |x| < 1, is one of those rare examples whose sum we are
able to determine by finding first a simple formula for its partial sums. (A special case
with x = 1/2 was encountered in Section 10.1 in connection with Zeno’s paradox.) The
real importance of this series lies in the fact that it may be used as a starting point for
determining the sums of a large number of other interesting series. For example, if we
assume |x| < 1 and replace x by $x^2$ in (10.25), we obtain the formula

(10.26)    $1 + x^2 + x^4 + \cdots + x^{2n} + \cdots = \frac{1}{1 - x^2}$    if |x| < 1.
Notice that this series contains those terms of (10.25) with even exponents. To find the
sum of the odd powers alone, we need only multiply both sides of (10.26) by x to obtain

(10.27)    $x + x^3 + x^5 + \cdots + x^{2n+1} + \cdots = \frac{x}{1 - x^2}$    if |x| < 1.

If we replace x by −x in (10.25), we find that

(10.28)    $1 - x + x^2 - x^3 + \cdots + (-1)^n x^n + \cdots = \frac{1}{1 + x}$    if |x| < 1.

Replacing x by $x^2$ in (10.28), we find that

(10.29)    $1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots = \frac{1}{1 + x^2}$    if |x| < 1.

Multiplying both sides of (10.29) by x, we obtain

(10.30)    $x - x^3 + x^5 - x^7 + \cdots + (-1)^n x^{2n+1} + \cdots = \frac{x}{1 + x^2}$    if |x| < 1.

If we replace x by 2x in (10.26), we find that

$1 + 4x^2 + 16x^4 + \cdots + 4^n x^{2n} + \cdots = \frac{1}{1 - 4x^2}$,

which is valid if |2x| < 1 or, what is the same thing, if |x| < 1/2. It is clear that many other
examples may be constructed by similar means.

All these series have the special form 

    $\sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots$

and are known as power series. The numbers $a_0, a_1, a_2, \ldots$, which may be real or complex, 
are called coefficients of the power series. The geometric series is an example with all 
coefficients equal to 1. If x and all the coefficients are real, the series is called a real power 
series. We shall find later, when we discuss the general theory of real power series, that it 
is permissible to differentiate and to integrate both sides of each of the Equations (10.25) 
through (10.30), treating the left-hand members as though they were ordinary finite sums. 
These operations lead to many remarkable new formulas. For example, differentiation of 
(10.25) gives us 

(10.31)    $1 + 2x + 3x^2 + \cdots + n x^{n-1} + \cdots = \dfrac{1}{(1 - x)^2}$    if $|x| < 1$,

whereas integration of (10.28) yields the interesting formula 

(10.32)    $x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \cdots + \dfrac{(-1)^n x^{n+1}}{n+1} + \cdots = \log(1 + x)$    if $|x| < 1$,

which expresses the logarithm as a power series. This is the discovery of Mercator and 
Brouncker (1668) that we mentioned earlier. Although each of the Equations (10.25) 
through (10.31) is valid for x in the open interval -1 < x < +1, it turns out that the 
logarithmic series in (10.32) is valid at the endpoint x = +1 as well. 
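
The behavior of the logarithmic series inside the interval and at the endpoint x = 1 can be observed numerically. This is a sketch under stated assumptions: the helper name `log_series` and the term counts are chosen for illustration only.

```python
import math

def log_series(x, n_terms):
    """Partial sum x - x^2/2 + x^3/3 - ... using n_terms terms (hypothetical helper)."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n_terms + 1))

if __name__ == "__main__":
    # Inside the interval the convergence is fast.
    print(log_series(0.5, 60), math.log(1.5))
    # At the endpoint x = 1 the series still converges, but slowly.
    print(log_series(1.0, 1000), math.log(2.0))
```

At x = 1 the partial sums approach log 2 only at a rate comparable to 1/n, which is why the endpoint case requires a separate argument.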

Another important example, which may be obtained by integration of (10.29), is the 
following power-series expansion for the inverse tangent, discovered in 1671 by James 
Gregory (1638-1675): 

(10.33)    $x - \dfrac{x^3}{3} + \dfrac{x^5}{5} - \dfrac{x^7}{7} + \cdots + \dfrac{(-1)^n x^{2n+1}}{2n+1} + \cdots = \arctan x$.

Gregory’s series converges for each complex x with |x| < 1 and also for x = ±1. When 
x is real, the series agrees with the inverse tangent function introduced in Chapter 6. The 
series can be used to extend the definition of the arctangent function from real values of x 
to complex x with |x| < 1. 
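
A short numerical sketch of Gregory's series follows. The helper name `gregory`, the sample points, and the comparison against the library functions `math.atan` and `cmath.atan` are assumptions made for illustration; the comparison for complex x relies on the library's principal branch agreeing with the series inside the unit disk.

```python
import cmath
import math

def gregory(x, n_terms):
    """Partial sum of x - x^3/3 + x^5/5 - ... with n_terms terms (hypothetical helper)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(n_terms))

if __name__ == "__main__":
    print(gregory(0.5, 40), math.atan(0.5))          # real x with |x| < 1
    z = 0.3 + 0.4j                                   # complex x with |x| = 0.5
    print(gregory(z, 200), cmath.atan(z))
    print(4 * gregory(1.0, 1000), math.pi)           # endpoint x = 1, slow convergence
```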

Many of the other elementary functions of calculus, such as the sine, cosine, and 
exponential, may also be represented by power series. This is not too surprising, in view of 
Taylor’s formula which tells us that any function may be approximated by a Taylor 
polynomial in x of degree $\le n$ if it has derivatives of order n + 1 in some neighborhood of the 
origin. In the examples given above, the partial sums of the power series are precisely the 
Taylor polynomials. When a function f has derivatives of every order in a neighborhood 
of the origin, then for every positive integer n Taylor’s formula leads to an equation of the 
form 

(10.34)    $f(x) = \sum_{k=0}^{n} a_k x^k + E_n(x)$,

where the finite sum $\sum_{k=0}^{n} a_k x^k$ is a Taylor polynomial of degree $\le n$ and $E_n(x)$ is the error 
for this approximation. If, now, we keep x fixed and let n increase without bound in (10.34), 
the Taylor polynomials give rise to a power series, namely $\sum_{k=0}^{\infty} a_k x^k$, where each coefficient 
$a_k$ is determined as follows: 

    $a_k = \dfrac{f^{(k)}(0)}{k!}$.

If, for some x, the error $E_n(x)$ tends to 0 as $n \to \infty$, then for this x we may let $n \to \infty$ in 
(10.34) to obtain 

    $f(x) = \lim_{n \to \infty} \sum_{k=0}^{n} a_k x^k + \lim_{n \to \infty} E_n(x) = \sum_{k=0}^{\infty} a_k x^k$.

In other words, the power series in question converges to f(x). If x is a point for which 
$E_n(x)$ does not tend to 0 as $n \to \infty$, then the partial sums will not approach f(x). Conditions 
on f for guaranteeing that $E_n(x) \to 0$ will be discussed later in Section 11.10. 

To lay a better foundation for the general theory of power series, we turn next to certain 
general questions related to the convergence and divergence of arbitrary series. We shall 
return to the subject of power series in Chapter 11. 


10.9 Exercises 


Each of the series in Exercises 1 through 10 is a telescoping series, or a geometric series, or some 
related series whose partial sums may be simplified. In each case, prove that the series converges 
and has the sum indicated. 


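 1. $\sum_{n=1}^{\infty} \dfrac{1}{(2n-1)(2n+1)} = \dfrac{1}{2}$.

 2. $\sum_{n=1}^{\infty} \dfrac{2}{3^{n-1}} = 3$.

 3. $\sum_{n=2}^{\infty} \dfrac{1}{n^2 - 1} = \dfrac{3}{4}$.

 4. $\sum_{n=1}^{\infty} \dfrac{2^n + 3^n}{6^n} = \dfrac{3}{2}$.

 5. $\sum_{n=1}^{\infty} \dfrac{\sqrt{n+1} - \sqrt{n}}{\sqrt{n^2 + n}} = 1$.

 6. $\sum_{n=1}^{\infty} \dfrac{n}{(n+1)(n+2)(n+3)} = \dfrac{1}{4}$.

 7. $\sum_{n=1}^{\infty} \dfrac{2n+1}{n^2 (n+1)^2} = 1$.

 8. $\sum_{n=1}^{\infty} \dfrac{2^n + n^2 + n}{2^{n+1} n (n+1)} = 1$.

 9. $\sum_{n=1}^{\infty} \dfrac{(-1)^{n-1} (2n+1)}{n(n+1)} = 1$.

10. $\sum_{n=2}^{\infty} \dfrac{\log [(1 + 1/n)^n (1 + n)]}{(\log n^n)[\log (n+1)^{n+1}]} = \log_2 \sqrt{e}$.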


Power series for log (1 + x) and arctan x were obtained in Section 10.8 by performing various 
operations on the geometric series. In a similar manner, without attempting to justify the steps, 
obtain the formulas in Exercises 11 through 19. They are all valid at least for |x| < 1. (The 
theoretical justification is provided in Section 11.8.) 

11. $\sum_{n=1}^{\infty} n x^n = \dfrac{x}{(1 - x)^2}$.

12. $\sum_{n=1}^{\infty} n^2 x^n = \dfrac{x^2 + x}{(1 - x)^3}$.

13. $\sum_{n=1}^{\infty} n^3 x^n = \dfrac{x^3 + 4x^2 + x}{(1 - x)^4}$.

14. $\sum_{n=1}^{\infty} n^4 x^n = \dfrac{x^4 + 11x^3 + 11x^2 + x}{(1 - x)^5}$.

15. $\sum_{n=1}^{\infty} \dfrac{x^n}{n} = \log \dfrac{1}{1 - x}$.

16. $\sum_{n=1}^{\infty} \dfrac{x^{2n-1}}{2n-1} = \dfrac{1}{2} \log \dfrac{1 + x}{1 - x}$.

17. $\sum_{n=0}^{\infty} (n+1) x^n = \dfrac{1}{(1 - x)^2}$.

18. $\sum_{n=0}^{\infty} \dfrac{(n+1)(n+2)}{2!} x^n = \dfrac{1}{(1 - x)^3}$.

19. $\sum_{n=0}^{\infty} \dfrac{(n+1)(n+2)(n+3)}{3!} x^n = \dfrac{1}{(1 - x)^4}$.

20. The results of Exercises 11 through 14 suggest that there exists a general formula of the form 

        $\sum_{n=1}^{\infty} n^k x^n = \dfrac{P_k(x)}{(1 - x)^{k+1}}$,

    where $P_k(x)$ is a polynomial of degree k, the term of lowest degree being x and that of highest 
    degree being $x^k$. Prove this by induction, without attempting to justify the formal 
    manipulations with the series. 

21. The results of Exercises 17 through 19 suggest the more general formula 

        $\sum_{n=0}^{\infty} \binom{n+k}{k} x^n = \dfrac{1}{(1 - x)^{k+1}}$,    where    $\binom{n+k}{k} = \dfrac{(n+1)(n+2) \cdots (n+k)}{k!}$.

    Prove this by induction, without attempting to justify the formal manipulations with the 
    series. 

22. Given that $\sum_{n=0}^{\infty} x^n / n! = e^x$ for all x, find the sums of the following series, assuming it is 
    permissible to operate on infinite series as though they were finite sums. 

    (a) $\sum_{n=2}^{\infty} \dfrac{n - 1}{n!}$.    (b) $\sum_{n=2}^{\infty} \dfrac{n + 1}{n!}$.    (c) $\sum_{n=2}^{\infty} \dfrac{(n - 1)(n + 1)}{n!}$.

23. (a) Given that $\sum_{n=0}^{\infty} x^n / n! = e^x$ for all x, show that 

        $\sum_{n=1}^{\infty} \dfrac{n^2 x^n}{n!} = (x^2 + x) e^x$,

    assuming it is permissible to operate on these series as though they were finite sums. 

    (b) The sum of the series $\sum_{n=1}^{\infty} n^3 / n!$ is ke, where k is a positive integer. Find the value of k. 
    Do not attempt to justify formal manipulations. 

24. Two series $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are called identical if $a_n = b_n$ for each $n \ge 1$. For example, 
    the series 

        0 + 0 + 0 + ···    and    (1 - 1) + (1 - 1) + (1 - 1) + ··· 

    are identical, but the series 

        1 + 1 + 1 + ···    and    1 + 0 + 1 + 0 + 1 + 0 + ··· 

    are not identical. Determine whether or not the series are identical in each of the following 
    pairs: 

    (a) 1 - 1 + 1 - 1 + ···    and    (2 - 1) - (3 - 2) + (4 - 3) - (5 - 4) + ···. 

    (b) 1 - 1 + 1 - 1 + ···    and    (1 - 1) + (1 - 1) + (1 - 1) + (1 - 1) + ···. 

    (c) 1 - 1 + 1 - 1 + ···    and    1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + ···. 

    (d) 1 + 1/2 + 1/4 + 1/8 + ···    and    1 + (1 - 1/2) + (1/2 - 1/4) + (1/4 - 1/8) + ···. 


25. (a) Use (10.26) to prove that 

        $1 + 0 + x^2 + 0 + x^4 + \cdots = \dfrac{1}{1 - x^2}$    if $|x| < 1$.

    Note that, according to the definition given in Exercise 24, this series is not identical to the 
    one in (10.26) if $x \ne 0$. 

    (b) Apply Theorem 10.2 to the result in part (a) and to (10.25) to deduce (10.27). 

    (c) Show that Theorem 10.2 when applied directly to (10.25) and (10.26) does not yield (10.27). 
    Instead, it yields the formula $\sum_{n=1}^{\infty} (x^n - x^{2n}) = x/(1 - x^2)$, valid for |x| < 1. 


*10.10 Exercises on decimal expansions 

Decimal representations of real numbers were introduced in Section I 3.15. It was shown there 
that every positive real x has a decimal representation of the form 

    $x = a_0 . a_1 a_2 a_3 \ldots$,

where $0 \le a_k \le 9$ for each $k \ge 1$. The number x is related to the digits $a_0, a_1, a_2, \ldots$ by the 
inequalities 

(10.35)    $a_0 + \dfrac{a_1}{10} + \cdots + \dfrac{a_n}{10^n} \le x < a_0 + \dfrac{a_1}{10} + \cdots + \dfrac{a_{n-1}}{10^{n-1}} + \dfrac{a_n + 1}{10^n}$.

If we let $s_n = \sum_{k=0}^{n} a_k / 10^k$, and if we subtract $s_n$ from each member of (10.35), we obtain 

    $0 \le x - s_n < 10^{-n}$.

This shows that $s_n \to x$ as $n \to \infty$, and hence x is given by the convergent series 

(10.36)    $x = \sum_{k=0}^{\infty} \dfrac{a_k}{10^k}$.

Each of the infinite decimal expansions in Exercises 1 through 5 is understood to be repeated 
indefinitely as suggested. In each case, express the decimal as an infinite series, find the sum of the 
series, and thereby express x as a quotient of two integers. 

1. x = 0.4444 .... 

2. x = 0.51515151 .... 

3. x = 2.02020202 .... 

4. x = 0.123123123123 .... 

5. x = 0.142857142857142857142857 .... 

6. Prove that every repeating decimal represents a rational number. 

7. If a number has a decimal expansion which ends in zeros, such as 1/8 = 0.1250000 ..., then 
this number can also be written as a decimal which ends in nines if we decrease the last nonzero 
digit by one unit. For example, 1/8 = 0.1249999 .... Use infinite series to prove this statement. 

The decimal representation in (10.36) may be generalized by replacing the integer 10 by any 
other integer b > 1. If x > 0, let $a_0$ denote the greatest integer in x; assuming that $a_0, a_1, \ldots, a_{n-1}$ 
have been defined, let $a_n$ denote the largest integer such that 

    $\sum_{k=0}^{n} \dfrac{a_k}{b^k} \le x$.

The following exercises refer to the sequence of integers $a_0, a_1, a_2, \ldots$ so obtained. 

8. Show that $0 \le a_k \le b - 1$ for each $k \ge 1$. 

9. Describe a geometric method for obtaining the numbers $a_0, a_1, a_2, \ldots$. 

10. Show that the series $\sum_{k=0}^{\infty} a_k / b^k$ converges and has sum x. This provides a decimal expansion 
of x in the scale of b. Important special cases, other than b = 10, are the binary scale, b = 2, 
and the duodecimal scale, b = 12. 
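
The inductive definition of the digits in the scale of b can be sketched in a few lines of code. The function name `digits_in_scale` and the use of exact rational arithmetic are assumptions made for this illustration; the loop tracks the scaled remainder $b^k (x - s_k)$, whose integer part gives the next digit.

```python
from fractions import Fraction
from math import floor

def digits_in_scale(x, b, n):
    """Return [a_0, a_1, ..., a_n], where a_k is the largest integer such that
    a_0 + a_1/b + ... + a_k/b^k <= x (hypothetical helper, x > 0 rational)."""
    x = Fraction(x)
    digits = [floor(x)]              # a_0 is the greatest integer in x
    remainder = x - digits[0]        # equals b^0 * (x - s_0)
    for _ in range(n):
        remainder *= b               # now equals b^k * (x - s_{k-1})
        d = floor(remainder)         # largest integer d with d/b^k <= x - s_{k-1}
        digits.append(d)
        remainder -= d
    return digits

if __name__ == "__main__":
    print(digits_in_scale(Fraction(1, 8), 10, 4))   # decimal digits of 1/8
    print(digits_in_scale(Fraction(1, 3), 2, 6))    # binary digits of 1/3
    print(digits_in_scale(Fraction(1, 7), 12, 6))   # duodecimal digits of 1/7
```

Exact fractions are used so that the floor operations are not disturbed by rounding; with floating-point input the later digits would eventually become unreliable.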






10.11 Tests for convergence 

In theory, the convergence or divergence of a particular series $\sum a_n$ is decided by 
examining its partial sums $s_n$ to see whether or not they tend to a finite limit as $n \to \infty$. In some 
special cases, such as the geometric series, the sums defining $s_n$ may be simplified to the 
point where it becomes a simple matter to determine their behavior for large n. However, 
in the majority of cases there is no nice formula for simplifying $s_n$ and the convergence 
or divergence may be rather difficult to establish in a straightforward manner. Early 
investigators in the subject, notably Cauchy and his contemporaries, realized this difficulty 
and they developed a number of “convergence tests” that by-passed the need for an explicit 
knowledge of the partial sums. A few of the simplest and most useful of these tests will 
be discussed in this chapter, but first we want to make some general remarks about the nature 
of these tests. 

Convergence tests may be broadly classified into three categories: (i) sufficient conditions; 
(ii) necessary conditions; (iii) necessary and sufficient conditions. A test of type (i) may 
be expressed symbolically as follows: 

    “If C is satisfied, then $\sum a_n$ converges,” 

where C stands for the condition in question. Tests of type (ii) have the form 

    “If $\sum a_n$ converges, then C is satisfied,” 

whereas those of type (iii) may be written thus: 

    “$\sum a_n$ converges if and only if C is satisfied.” 

We shall see presently that there are tests of type (ii) that are not of type (i) (and vice versa). 
Beginners often use such tests incorrectly by failing to realize the difference between a 
necessary condition and a sufficient condition. Therefore the reader should make an effort 
to keep this distinction in mind when using a particular test in practice. 

The simplest of all convergence tests gives a necessary condition for convergence and 
may be stated as follows. 

THEOREM 10.6. If the series $\sum a_n$ converges, then its nth term tends to 0; that is, 

(10.37)    $\lim_{n \to \infty} a_n = 0$.

Proof. Let $s_n = a_1 + a_2 + \cdots + a_n$. Then $a_n = s_n - s_{n-1}$. As $n \to \infty$, both $s_n$ and 
$s_{n-1}$ tend to the same limit and hence $a_n \to 0$. This proves the theorem. 

This is an example of a test of type (ii) which is not of type (i). Condition (10.37) is not 
sufficient for convergence. For example, when $a_n = 1/n$, the condition $a_n \to 0$ is satisfied 
but the series $\sum 1/n$ diverges. The real usefulness of this test is that it gives us a sufficient 
condition for divergence. That is, if the terms $a_n$ of a series $\sum a_n$ do not tend to zero, then 
the series must diverge. This statement is logically equivalent to Theorem 10.6. 
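
The gap between a necessary and a sufficient condition can be seen numerically on the harmonic series: the terms tend to 0 while the partial sums keep growing. The helper name and the sample values of n below are illustrative assumptions.

```python
def harmonic_partial_sum(n):
    """Return 1 + 1/2 + ... + 1/n (hypothetical helper)."""
    return sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 100, 10_000, 1_000_000):
        # The term 1/n shrinks toward 0, yet the partial sum continues to increase.
        print(n, 1.0 / n, harmonic_partial_sum(n))
```

A finite computation of this kind only suggests divergence; it cannot prove it, since the growth is very slow.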

10.12 Comparison tests for series of nonnegative terms 

In this section we shall be concerned with series having nonnegative terms, that is, series 
of the form $\sum a_n$ where each $a_n \ge 0$. Since the partial sums of such series are monotonic 
increasing, we may use Theorem 10.1 to obtain the following necessary and sufficient 
condition for convergence. 


THEOREM 10.7. Assume that $a_n \ge 0$ for each $n \ge 1$. Then the series $\sum a_n$ converges 
if and only if the sequence of its partial sums is bounded above. 


If the partial sums are bounded above by a number M, say, then the sum of the series 
cannot exceed M. 

EXAMPLE 1. Theorem 10.7 may be used to establish the convergence of the series $\sum_{n=1}^{\infty} 1/n!$. 
We estimate the partial sums from above by using the inequality 

    $\dfrac{1}{k!} \le \dfrac{1}{2^{k-1}}$,

which is obviously true for all $k \ge 1$ since k! consists of k - 1 factors, each $\ge 2$. Therefore 
we have 

    $\sum_{k=1}^{n} \dfrac{1}{k!} \le \sum_{k=1}^{n} \dfrac{1}{2^{k-1}} = \sum_{k=0}^{n-1} \dfrac{1}{2^k} < \sum_{k=0}^{\infty} \dfrac{1}{2^k} = 2$,

the last series being a geometric series. The series $\sum_{n=1}^{\infty} 1/n!$ is therefore convergent and 
has a sum $\le 2$. We shall see later that the sum of this series is e - 1, where e is the Euler 
number. 
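
The chain of inequalities in Example 1 can be checked directly for small n. The function names and the range of n are assumptions chosen for this sketch.

```python
import math

def factorial_partial_sum(n):
    """Return 1/1! + 1/2! + ... + 1/n! (hypothetical helper)."""
    return sum(1 / math.factorial(k) for k in range(1, n + 1))

def geometric_bound(n):
    """Return 1 + 1/2 + ... + 1/2^(n-1), the dominating geometric partial sum."""
    return sum(1 / 2 ** (k - 1) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        # Columns: n, partial sum of 1/k!, geometric bound (always below 2).
        print(n, factorial_partial_sum(n), geometric_bound(n))
    print(math.e - 1)  # the value the partial sums approach
```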

The convergence of the foregoing example was established by comparing the terms of 
the given series with those of a series known to converge. This idea may be pursued further 
to yield a number of tests known as comparison tests. 


THEOREM 10.8. COMPARISON TEST. Assume $a_n \ge 0$ and $b_n \ge 0$ for all $n \ge 1$. If there 
exists a positive constant c such that 

(10.38)    $a_n \le c\, b_n$

for all n, then convergence of $\sum b_n$ implies convergence of $\sum a_n$. 

Note: The conclusion may also be formulated as follows: “Divergence of $\sum a_n$ implies 
divergence of $\sum b_n$.” This statement is logically equivalent to Theorem 10.8. When the 
inequality (10.38) is satisfied, we say that the series $\sum b_n$ dominates the series $\sum a_n$. 

Proof. Let $s_n = a_1 + \cdots + a_n$, $t_n = b_1 + \cdots + b_n$. Then (10.38) implies $s_n \le c\, t_n$. If 
$\sum b_n$ converges, its partial sums are bounded, say by M. Then $s_n \le cM$, and hence $\sum a_n$ 
is also convergent since its partial sums are bounded by cM. This completes the proof. 

Omitting a finite number of terms at the beginning of a series does not affect its 
convergence or divergence. Therefore Theorem 10.8 still holds true if the inequality (10.38) is 
valid only for all $n \ge N$ for some N. 


THEOREM 10.9. LIMIT COMPARISON TEST. Assume that $a_n > 0$ and $b_n > 0$ for all $n \ge 1$, 
and suppose that 

(10.39)    $\lim_{n \to \infty} \dfrac{a_n}{b_n} = 1$.

Then $\sum a_n$ converges if and only if $\sum b_n$ converges. 

Proof. There exists an N such that $n \ge N$ implies $\tfrac{1}{2} < a_n / b_n < \tfrac{3}{2}$. Therefore $b_n < 2 a_n$ 
and $a_n < \tfrac{3}{2} b_n$ for all $n \ge N$, and the theorem follows by applying Theorem 10.8 twice. 

Note that Theorem 10.9 also holds if $\lim_{n \to \infty} a_n / b_n = c$, provided that c > 0, because 
we then have $\lim_{n \to \infty} a_n / (c\, b_n) = 1$ and we may compare $\sum a_n$ with $\sum (c\, b_n)$. However, if 
$\lim_{n \to \infty} a_n / b_n = 0$, we conclude only that convergence of $\sum b_n$ implies convergence of $\sum a_n$. 

DEFINITION. Two sequences $\{a_n\}$ and $\{b_n\}$ of complex numbers are said to be asymptotically 
equal if 

    $\lim_{n \to \infty} \dfrac{a_n}{b_n} = 1$.

This relation is often indicated symbolically by writing 

(10.40)    $a_n \sim b_n$    as $n \to \infty$.

The notation $a_n \sim b_n$ is read “$a_n$ is asymptotically equal to $b_n$,” and it is intended to 
suggest that $a_n$ and $b_n$ behave in essentially the same way for large n. Using this terminology, 
we may state the limit comparison test in the following manner. 


THEOREM 10.10. Two series $\sum a_n$ and $\sum b_n$ with terms that are positive and asymptotically 
equal converge together or they diverge together. 


EXAMPLE 2. THE RIEMANN ZETA-FUNCTION. In Example 1 of Section 10.7, we proved 
that $\sum 1/(n^2 + n)$ is a convergent telescoping series. If we use this as a comparison series, 
it follows that $\sum 1/n^2$ is convergent, since $1/n^2 \sim 1/(n^2 + n)$ as $n \to \infty$. Also, $\sum 1/n^2$ 
dominates $\sum 1/n^s$ for $s \ge 2$, and therefore $\sum 1/n^s$ converges for every real $s \ge 2$. We shall 
prove in the next section that this series also converges for every s > 1. Its sum, denoted 
by $\zeta(s)$ ($\zeta$ is the Greek letter zeta), defines an important function in analysis known as the 
Riemann zeta-function: 

    $\zeta(s) = \sum_{n=1}^{\infty} \dfrac{1}{n^s}$    if s > 1.

Euler discovered many beautiful formulas involving $\zeta(s)$. In particular, he found that 
$\zeta(2) = \pi^2 / 6$, a result which is not easy to derive at this stage. 

EXAMPLE 3. Since $\sum 1/n$ diverges, every series having positive terms asymptotically 
equal to 1/n must also diverge. For example, this is true of the two series 

    $\sum_{n=1}^{\infty} \dfrac{1}{\sqrt{n(n + 10)}}$    and    $\sum_{n=1}^{\infty} \sin \dfrac{1}{n}$.

The relation $\sin (1/n) \sim 1/n$ follows from the fact that $(\sin x)/x \to 1$ as $x \to 0$. 
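
The statements in Examples 2 and 3 can be illustrated numerically. The sketch below assumes the helper names shown and uses arbitrary sample sizes; it compares a partial sum of $\sum 1/n^2$ with Euler's value $\pi^2/6$ and checks that the ratios defining asymptotic equality are close to 1 for large n.

```python
import math

def zeta_partial_sum(s, n):
    """Return 1/1^s + 1/2^s + ... + 1/n^s (hypothetical helper)."""
    return sum(1 / k ** s for k in range(1, n + 1))

def asymptotic_ratio(a, b, n):
    """Return a(n)/b(n); values near 1 for large n suggest a_n ~ b_n."""
    return a(n) / b(n)

if __name__ == "__main__":
    print(zeta_partial_sum(2, 1000), math.pi ** 2 / 6)
    # 1/n^2 versus 1/(n^2 + n), as used in Example 2.
    print(asymptotic_ratio(lambda n: 1 / n ** 2, lambda n: 1 / (n ** 2 + n), 10 ** 6))
    # sin(1/n) versus 1/n, as used in Example 3.
    print(asymptotic_ratio(lambda n: math.sin(1 / n), lambda n: 1 / n, 10 ** 6))
```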

10.13 The integral test 

To use comparison tests effectively, we must have at our disposal some examples of 
series of known behavior. The geometric series and the zeta-function are useful for this 
purpose. New examples can be obtained very simply by applying the integral test, first 
proved by Cauchy in 1837. 




Figure 10.4 Proof of the integral test. 

THEOREM 10.11. INTEGRAL TEST. Let f be a positive decreasing function, defined for 
all real $x \ge 1$. For each $n \ge 1$, let 

    $s_n = \sum_{k=1}^{n} f(k)$    and    $t_n = \int_1^n f(x)\, dx$.

Then both sequences $\{s_n\}$ and $\{t_n\}$ converge or both diverge. 

Proof. By comparing f with appropriate step functions as suggested in Figure 10.4, we 
obtain the inequalities 

    $\sum_{k=2}^{n} f(k) \le \int_1^n f(x)\, dx \le \sum_{k=1}^{n-1} f(k)$,

or $s_n - f(1) \le t_n \le s_{n-1}$. Since both sequences $\{s_n\}$ and $\{t_n\}$ are monotonic increasing, 
these inequalities show that both are bounded above or both are unbounded. Therefore, 
both sequences converge or both diverge, as asserted. 

EXAMPLE 1. The integral test enables us to prove that 

    $\sum_{n=1}^{\infty} \dfrac{1}{n^s}$    converges if and only if s > 1.

Taking $f(x) = x^{-s}$, we have 

    $t_n = \int_1^n \dfrac{dx}{x^s} = \dfrac{n^{1-s} - 1}{1 - s}$  if $s \ne 1$,    and    $t_n = \log n$  if s = 1.

When s > 1 the term $n^{1-s} \to 0$ as $n \to \infty$ and hence $\{t_n\}$ converges. By the integral test, 
this implies convergence of the series for s > 1. 

When $s \le 1$, then $t_n \to \infty$ and the series diverges. The special case s = 1 (the harmonic 
series) was discussed earlier in Section 10.5. Its divergence was known to Leibniz. 
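
The inequalities $s_n - f(1) \le t_n \le s_{n-1}$ from the proof of the integral test can be verified numerically for the functions of Example 1. The function name `integral_test_bounds` and the way the antiderivative is passed in are assumptions of this sketch.

```python
import math

def integral_test_bounds(f, antiderivative, n):
    """Return (s_n - f(1), t_n, s_{n-1}) for a positive decreasing f, where
    t_n is computed from a known antiderivative (hypothetical helper)."""
    s_n = sum(f(k) for k in range(1, n + 1))
    t_n = antiderivative(n) - antiderivative(1)
    return s_n - f(1), t_n, s_n - f(n)

if __name__ == "__main__":
    # s = 2: f(x) = 1/x^2 with antiderivative -1/x; t_n stays below 1.
    print(integral_test_bounds(lambda x: 1 / x ** 2, lambda x: -1 / x, 1000))
    # s = 1: f(x) = 1/x with antiderivative log x; t_n grows without bound.
    print(integral_test_bounds(lambda x: 1 / x, math.log, 1000))
```

In the first case all three quantities stay bounded as n grows; in the second all three grow together, in line with the theorem.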




EXAMPLE 2. The same method may be used to prove that 

    $\sum_{n=2}^{\infty} \dfrac{1}{n (\log n)^s}$    converges if and only if s > 1.

(We start the sum with n = 2 to avoid n for which log n may be zero.) 
The corresponding integral in this case is 

    $t_n = \int_2^n \dfrac{dx}{x (\log x)^s} = \dfrac{(\log n)^{1-s} - (\log 2)^{1-s}}{1 - s}$  if $s \ne 1$,    and    $t_n = \log (\log n) - \log (\log 2)$  if s = 1.

Thus $\{t_n\}$ converges if and only if s > 1, and hence, by the integral test, the same holds 
true for the series in question. 


10.14 Exercises 

Test the following series for convergence or divergence. In each case, give a reason for your 
decision. 


UJ 

2 (4« _ 


% (4 n - 3)(4« - l) . 


(I 


„ x' V2n - 1 log (An + 1) ( 

' 2-t n(n + 1) 

n=i 

2 n ’ ^ 

n-l 

00 2 

V n 

n=l 


, V l sin ny \ - 
5 - Z n 2 • 

n = 1 

CO 

4 2 


n = 1 
00 


2 +(-l) n 

i! 


.6 


2 k! 

(» + 2)! ' ° 

yfiLi=,c 

±1, n Vn + 1 


ft =2 


7. 
10 . 


V* 1 + Vn 


i 

00 

14 2- 

n = 1 
00 

* 

T'C 

15 '2n 


n cos 3 («tt/3) 


11 


n = 1 
00 

•2 


log n (log log n) s 


co 


,k (l0 Z nY ' 

“ i i 

12. » kl < 10. C 


16. / ne 

«=i 
oo 


W~1 

CO 


13. 


' 2 1000« + 1 ' ^ 


W=1 


• 2 - 

v p/" V* . 

• 2 Jo TTT^' 

n = l 
oo 

is. ^ \l +1 e V ~ xdx ‘ 

«= i 


19. Assume f is a nonnegative increasing function defined for all $x \ge 1$. Use the method suggested 
by the proof of the integral test to show that 

    $\sum_{k=1}^{n-1} f(k) \le \int_1^n f(x)\, dx \le \sum_{k=2}^{n} f(k)$.

Take f(x) = log x and deduce the inequalities 

(10.41)    $e\, n^n e^{-n} < n! < e\, n^{n+1} e^{-n}$.

These give a rough estimate of the order of magnitude of n!. From (10.41), we may write 

    $\dfrac{e^{1/n}}{e} < \dfrac{(n!)^{1/n}}{n} < \dfrac{e^{1/n}\, n^{1/n}}{e}$.

Letting $n \to \infty$, we find that 

    $\dfrac{(n!)^{1/n}}{n} \to \dfrac{1}{e}$    or    $(n!)^{1/n} \sim \dfrac{n}{e}$    as $n \to \infty$.
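
The estimates in (10.41) can be checked numerically by working with logarithms, which avoids overflow for large n. The sketch below assumes the helper names shown and uses the standard-library function `math.lgamma`, for which lgamma(n + 1) = log n!; the strict inequalities are tested only for n > 1, since both sides agree at n = 1.

```python
import math

def log_factorial(n):
    """Return log(n!) via the log-gamma function."""
    return math.lgamma(n + 1)

def bounds_hold(n):
    """Check log(e n^n e^-n) < log n! < log(e n^(n+1) e^-n), i.e. (10.41)."""
    lower = 1 + n * math.log(n) - n
    upper = 1 + (n + 1) * math.log(n) - n
    return lower < log_factorial(n) < upper

def root_ratio(n):
    """Return (n!)^(1/n) / n, which should approach 1/e."""
    return math.exp(log_factorial(n) / n) / n

if __name__ == "__main__":
    print(all(bounds_hold(n) for n in range(2, 500)))
    for n in (10, 1000, 10 ** 6):
        print(n, root_ratio(n), 1 / math.e)
```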


10.15 The root test and the ratio test for series of nonnegative terms 

Using the geometric series $\sum x^n$ as a comparison series, Cauchy developed two useful 
tests known as the root test and the ratio test. 

If $\sum a_n$ is a series whose terms (from some point on) satisfy an inequality of the form 

(10.42)    $0 \le a_n \le x^n$,    where 0 < x < 1,

a direct application of the comparison test (Theorem 10.8) tells us that $\sum a_n$ converges. 
The inequalities in (10.42) are equivalent to 

(10.43)    $0 \le a_n^{1/n} \le x$;

hence the name root test. 

If the sequence $\{a_n^{1/n}\}$ is convergent, the test may be restated in a somewhat more useful 
form that makes no reference to the number x. 


THEOREM 10.12. ROOT TEST. Let $\sum a_n$ be a series of nonnegative terms such that 

    $a_n^{1/n} \to R$    as $n \to \infty$.

(a) If R < 1, the series converges. 

(b) If R > 1, the series diverges. 

(c) If R = 1, the test is inconclusive. 


Proof. Assume R < 1 and choose x so that R < x < 1. Then (10.43) must be satisfied 
for all $n \ge N$ for some N. Hence, $\sum a_n$ converges by the comparison test. This proves (a). 

To prove (b), we observe that R > 1 implies $a_n > 1$ for infinitely many values of n 
and hence $a_n$ cannot tend to 0. Therefore, by Theorem 10.6, $\sum a_n$ diverges. This proves (b). 

To prove (c), consider the two examples in which $a_n = 1/n$ and $a_n = 1/n^2$. In both 
cases R = 1 since $n^{1/n} \to 1$ as $n \to \infty$ [see Equation (10.12) of Section 10.2], but $\sum 1/n$ 
diverges whereas $\sum 1/n^2$ converges. 


EXAMPLE 1. The root test makes it easy to determine the convergence of the series 
$\sum (\log n)^{-n}$, since 

    $a_n^{1/n} = \dfrac{1}{\log n} \to 0$    as $n \to \infty$.


EXAMPLE 2. Applying the root test to $\sum [n/(n + 1)]^{n^2}$, we find that 

    $a_n^{1/n} = \left( \dfrac{n}{n + 1} \right)^n = \dfrac{1}{(1 + 1/n)^n} \to \dfrac{1}{e}$    as $n \to \infty$,

by Equation (10.13) of Section 10.2. Since 1/e < 1, the series converges. 
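
The limit R of the root test can be estimated numerically. The sketch below works with the logarithm of each term to avoid underflow; the function name `root_test_value` and the sample index n = 10^6 are assumptions for illustration, and a single large n only suggests the limit rather than establishing it.

```python
import math

def root_test_value(log_term, n):
    """Return a_n^(1/n), given a function computing log(a_n) (hypothetical helper)."""
    return math.exp(log_term(n) / n)

# log of a_n = (n/(n+1))^(n^2), written as -n^2 * log(1 + 1/n) for accuracy
def log_example_2(n):
    return -n * n * math.log1p(1 / n)

if __name__ == "__main__":
    n = 10 ** 6
    print(root_test_value(log_example_2, n), 1 / math.e)      # R = 1/e < 1
    print(root_test_value(lambda m: -math.log(m), n))          # a_n = 1/n,   R = 1
    print(root_test_value(lambda m: -2 * math.log(m), n))      # a_n = 1/n^2, R = 1
```

The last two lines show why case (c) is inconclusive: both values are close to 1, although one series diverges and the other converges.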


A slightly different use of the comparison test yields the ratio test. 


THEOREM 10.13. RATIO TEST. Let $\sum a_n$ be a series of positive terms such that 

    $\dfrac{a_{n+1}}{a_n} \to L$    as $n \to \infty$.

(a) If L < 1, the series converges. 

(b) If L > 1, the series diverges. 

(c) If L = 1, the test is inconclusive. 

Proof. Assume L < 1 and choose x so that L < x < 1. Then there must be an N 
such that $a_{n+1}/a_n < x$ for all $n \ge N$. This implies 

    $\dfrac{a_{n+1}}{x^{n+1}} < \dfrac{a_n}{x^n}$    for all $n \ge N$.

In other words, the sequence $\{a_n / x^n\}$ is decreasing for $n \ge N$. In particular, when $n \ge N$, 
we must have $a_n / x^n \le a_N / x^N$, or, in other words, 

    $a_n \le c\, x^n$,    where    $c = \dfrac{a_N}{x^N}$.

Therefore $\sum a_n$ is dominated by the convergent series $\sum x^n$. This proves (a). 

To prove (b), we simply observe that L > 1 implies $a_{n+1} > a_n$ for all $n \ge N$ for some N, 
and hence $a_n$ cannot approach 0. 

Finally, (c) is proved by using the same examples as in Theorem 10.12. 

Warning. If the test ratio $a_{n+1}/a_n$ is always less than 1, it does not necessarily follow that 
the limit L will be less than 1. For example, the harmonic series, which diverges, has test 
ratio n/(n + 1) which is always less than 1 but the limit L equals 1. On the other hand, for 
divergence it is sufficient that the test ratio be greater than 1 for all sufficiently large n 
because for such n we have $a_{n+1} > a_n$ and $a_n$ cannot approach 0. 

EXAMPLE 3. We may establish the convergence of the series $\sum n!/n^n$ by the ratio test. 
The ratio of consecutive terms is 

    $\dfrac{a_{n+1}}{a_n} = \dfrac{(n + 1)!}{(n + 1)^{n+1}} \cdot \dfrac{n^n}{n!} = \left( \dfrac{n}{n + 1} \right)^n = \dfrac{1}{(1 + 1/n)^n} \to \dfrac{1}{e}$    as $n \to \infty$,

by formula (10.13) of Section 10.2. Since 1/e < 1, the series converges. In particular, this 
implies that the general term of the series tends to 0; that is, 

(10.44)    $\dfrac{n!}{n^n} \to 0$    as $n \to \infty$.

This is often described by saying that $n^n$ “grows faster” than n! for large n. Also, with a 
natural extension of the o-notation, we can write (10.44) as follows: $n! = o(n^n)$ as $n \to \infty$. 
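
The computation in Example 3 can be reproduced with exact rational arithmetic. The helper names below are assumptions made for this sketch; the last function illustrates the Warning above, where every test ratio is below 1 although the limit is 1.

```python
import math
from fractions import Fraction

def term(n):
    """Return a_n = n!/n^n as an exact fraction (hypothetical helper)."""
    return Fraction(math.factorial(n), n ** n)

def consecutive_ratio(n):
    """Return a_{n+1}/a_n, which simplifies to (n/(n+1))^n."""
    return term(n + 1) / term(n)

def harmonic_ratio(n):
    """Test ratio of the harmonic series: always < 1, but its limit is 1."""
    return Fraction(n, n + 1)

if __name__ == "__main__":
    print(consecutive_ratio(10) == Fraction(10, 11) ** 10)
    print(float(consecutive_ratio(1000)), 1 / math.e)
    print(float(term(200)))                 # the general term is already tiny
    print(float(harmonic_ratio(1000)))      # below 1, yet the series diverges
```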

Note: The relation (10.44) may also be proved directly by writing 

    $\dfrac{n!}{n^n} = \dfrac{1}{n} \cdot \dfrac{2}{n} \cdots \dfrac{k}{n} \cdot \dfrac{k + 1}{n} \cdots \dfrac{n}{n}$,

where k = n/2 if n is even, and k = (n - 1)/2 if n is odd. If $n \ge 2$, the product of the first 
k factors on the right does not exceed $(\tfrac{1}{2})^k$, and each of the remaining factors does not 
exceed 1. Since $(\tfrac{1}{2})^k \to 0$ as $n \to \infty$, this proves (10.44). Relation (10.44) also follows 
from (10.41). 

The reader should realize that both the root test and the ratio test are, in reality, special 
cases of the comparison test. In both tests when we have case (a), convergence is deduced 
from the fact that the series in question can be dominated by a suitable geometric series 
$\sum x^n$. The usefulness of these tests in practice is that a knowledge of a particular comparison 
series $\sum x^n$ is not explicitly required. Further convergence tests may be deduced by using 
the comparison test in other ways. Two important examples known as Raabe’s test and 
Gauss’ test are described in Exercises 16 and 17 of Section 10.16. These are often helpful 
when the ratio test fails. 


10.16 Exercises 

Test the following series for convergence or divergence and give a reason for your decision in 
each case. 


^M 2 

■ Z, (2 »)! ' 


n — 1 


(nfi 

2 n% 


■2 

n = l 

y2 n n\ 

' 2-4 n n 
n = l 

00 

2 3 n n\ 
~rF ' 


10 . 


' 11 . 


uu 

. 2(« 1/n - 1)“. 

«= 1 

n=l 

2(1 - 


W=1 
00 , 
n! 

3 n 

w=i 

^ «/ 

6 - ^ 2 2 " 
n=l 
00 

„ V 


n —1 

v (1000)” 

n! ■ 

«=1 

00 .n+l/n 


12 


13 


2 n" 

i (« + l/«) n ' 

^ « :! [V2 + (-1)"]" 


«=1 

00 


3" 


Z, (log n) <!r 


14. ^ r n sin nx\, r > 0. 

n=i 

15. Let $\{a_n\}$ and $\{b_n\}$ be two sequences with $a_n > 0$ and $b_n > 0$ for all $n \ge N$, and let 
$c_n = b_n - b_{n+1} a_{n+1} / a_n$. Prove that: 

(a) If there is a positive constant r such that $c_n \ge r > 0$ for all $n \ge N$, then $\sum a_n$ converges. 
[Hint: Show that $\sum_{k=N}^{n} a_k \le a_N b_N / r$.] 

(b) If $c_n \le 0$ for $n \ge N$ and if $\sum 1/b_n$ diverges, then $\sum a_n$ diverges. 
[Hint: Show that $\sum a_n$ dominates $\sum 1/b_n$.] 

16. Let $\sum a_n$ be a series of positive terms. Prove Raabe’s test: If there is an r > 0 and an $N \ge 1$ 
such that 

    $\dfrac{a_{n+1}}{a_n} \le 1 - \dfrac{1}{n} - \dfrac{r}{n}$    for all $n \ge N$,

then $\sum a_n$ converges. The series $\sum a_n$ diverges if 

    $\dfrac{a_{n+1}}{a_n} \ge 1 - \dfrac{1}{n}$    for all $n \ge N$.

[Hint: Use Exercise 15 with $b_{n+1} = n$.] 

17. Let $\sum a_n$ be a series of positive terms. Prove Gauss’ test: If there is an $N \ge 1$, an s > 1, and 
an M > 0 such that 

    $\dfrac{a_{n+1}}{a_n} = 1 - \dfrac{A}{n} + \dfrac{f(n)}{n^s}$    for $n \ge N$,

where $|f(n)| \le M$ for all n, then $\sum a_n$ converges if A > 1 and diverges if $A \le 1$. 
[Hint: If $A \ne 1$, use Exercise 16. If A = 1, use Exercise 15 with $b_{n+1} = n \log n$.] 

18. Use Gauss’ test (in Exercise 17) to prove that the series 

    $\sum_{n=1}^{\infty} \left( \dfrac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \right)^k$

converges if k > 2 and diverges if $k \le 2$. For this example the ratio test fails. 


10.17 Alternating series 

Up to now we have been concerned largely with series of nonnegative terms. We wish 
to turn our attention next to series whose terms may be positive or negative. The simplest 
examples occur when the terms alternate in sign. These are called alternating series and 
they have the form 

(10.45)    $\sum_{n=1}^{\infty} (-1)^{n-1} a_n = a_1 - a_2 + a_3 - a_4 + \cdots + (-1)^{n-1} a_n + \cdots$,

where each $a_n > 0$. 

Examples of alternating series were known to many early investigators. We have already 
mentioned the logarithmic series 

    $\log (1 + x) = x - \dfrac{x^2}{2} + \dfrac{x^3}{3} - \dfrac{x^4}{4} + \cdots + (-1)^{n-1} \dfrac{x^n}{n} + \cdots$.

As we shall prove later on, this series converges and has the sum log (1 + x) whenever 
$-1 < x \le 1$. For positive x, it is an alternating series. In particular, when x = 1 we 
obtain the formula 

(10.46)    $\log 2 = 1 - \dfrac{1}{2} + \dfrac{1}{3} - \dfrac{1}{4} + \cdots + \dfrac{(-1)^{n-1}}{n} + \cdots$,

which tells us that the alternating harmonic series has the sum log 2. This result is of 
special interest in view of the fact that the harmonic series $\sum 1/n$ diverges. 

Closely related to (10.46) is the interesting formula 

(10.47)    $\dfrac{\pi}{4} = 1 - \dfrac{1}{3} + \dfrac{1}{5} - \dfrac{1}{7} + \cdots + \dfrac{(-1)^{n-1}}{2n - 1} + \cdots$,

discovered by James Gregory in 1671. Leibniz rediscovered this result in 1673 while 
computing the area of a unit circular disk. 

Both series in (10.46) and in (10.47) are alternating series of the form (10.45) in which the 
sequence $\{a_n\}$ decreases monotonically to zero. Leibniz noticed, in 1705, that this simple 
property of the $a_n$ implies the convergence of any alternating series. 

THEOREM 10.14. LEIBNIZ’S RULE. If $\{a_n\}$ is a monotonic decreasing sequence with limit 
0, then the alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} a_n$ converges. If S denotes its sum and $s_n$ its nth 
partial sum, we also have the inequalities 

(10.48)    $0 < (-1)^n (S - s_n) < a_{n+1}$    for each $n \ge 1$.

The inequalities in (10.48) provide a useful way to estimate the error in approximating 
the sum S by any partial sum $s_n$. The first inequality tells us that the error, $S - s_n$, has 
the sign $(-1)^n$, which is the same as the sign of the first neglected term, $(-1)^n a_{n+1}$. The 
second inequality states that the absolute value of this error is less than that of the first 
neglected term. 
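
The error estimate (10.48) can be checked numerically on the alternating harmonic series, whose sum is log 2 by (10.46). The helper name and the range of n below are assumptions made for this sketch.

```python
import math

def alt_harmonic_partial(n):
    """Return s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n (hypothetical helper)."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

def leibniz_bound_holds(n, total=math.log(2)):
    """Check 0 < (-1)^n (S - s_n) < a_{n+1} with a_n = 1/n and S = log 2."""
    signed_error = (-1) ** n * (total - alt_harmonic_partial(n))
    return 0 < signed_error < 1 / (n + 1)

if __name__ == "__main__":
    for n in (1, 2, 10, 11, 100):
        s_n = alt_harmonic_partial(n)
        # Columns: n, partial sum, error S - s_n, size of the first neglected term.
        print(n, s_n, math.log(2) - s_n, 1 / (n + 1))
```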

FIGURE 10.5 Proof of Leibniz’s rule for alternating series (partial sums $s_n$ shown separately for n even and n odd). 

Proof. The idea of the proof of Leibniz’s rule is quite simple and is illustrated in Figure 
10.5. The partial sums $s_{2n}$ (consisting of an even number of terms) form an increasing 
sequence because $s_{2n+2} - s_{2n} = a_{2n+1} - a_{2n+2} \ge 0$. Similarly, the partial sums $s_{2n-1}$ form 
a decreasing sequence. Both sequences are bounded below by $s_2$ and above by $s_1$. 
Therefore, each sequence $\{s_{2n}\}$ and $\{s_{2n-1}\}$, being monotonic and bounded, converges to a limit, 
say $s_{2n} \to S'$, and $s_{2n-1} \to S''$. But $S' = S''$ because 

    $S' - S'' = \lim_{n \to \infty} s_{2n} - \lim_{n \to \infty} s_{2n-1} = \lim_{n \to \infty} (s_{2n} - s_{2n-1}) = \lim_{n \to \infty} (-a_{2n}) = 0$.

If we denote this common limit by S, it is clear that the series converges and has sum S. 

To derive the inequalities in (10.48) we argue as follows: Since $s_{2n} \nearrow$ and $s_{2n-1} \searrow$, we have 

    $s_{2n} < s_{2n+2} \le S$    and    $S \le s_{2n+1} < s_{2n-1}$    for all $n \ge 1$.

Therefore we have the inequalities 

    $0 < S - s_{2n} \le s_{2n+1} - s_{2n} = a_{2n+1}$    and    $0 < s_{2n-1} - S \le s_{2n-1} - s_{2n} = a_{2n}$,

which, taken together, yield (10.48). This completes the proof. 

EXAMPLE 1. Since $1/n \searrow$ and $1/n \to 0$ as $n \to \infty$, the convergence of the alternating 
harmonic series $1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \cdots$ is an immediate consequence of Leibniz’s rule. The 
sum of this series is computed below in Example 4. 

EXAMPLE 2. The alternating series $\sum (-1)^n (\log n)/n$ converges. To prove this using 
Leibniz’s rule, we must show that $(\log n)/n \to 0$ as $n \to \infty$ and that $(\log n)/n \searrow$. The first 
statement follows from Equation (10.11) of Section 10.2. To prove the second statement, 
we note that the function f for which 

    $f(x) = \dfrac{\log x}{x}$    when x > 0

has the derivative $f'(x) = (1 - \log x)/x^2$. When x > e, this is negative and f is monotonic 
decreasing. In particular, f(n + 1) < f(n) for $n \ge 3$. 


EXAMPLE 3. An important limit relation may be derived as a consequence of Leibniz’s 
rule. Let 

    $a_1 = 1$,    $a_2 = \int_1^2 \dfrac{dx}{x}$,    $a_3 = \dfrac{1}{2}$,    $a_4 = \int_2^3 \dfrac{dx}{x}$,    ...,

where, in general, 

    $a_{2n-1} = \dfrac{1}{n}$    and    $a_{2n} = \int_n^{n+1} \dfrac{dx}{x}$    for n = 1, 2, 3, ....

It is easy to verify that $a_n \to 0$ as $n \to \infty$ and that $a_n \searrow$. Hence the series $\sum (-1)^{n-1} a_n$ 
converges. Denote its sum by C and its nth partial sum by $s_n$. The (2n - 1)st partial sum 
may be expressed as follows: 

    $s_{2n-1} = 1 - \int_1^2 \dfrac{dx}{x} + \dfrac{1}{2} - \int_2^3 \dfrac{dx}{x} + \cdots + \dfrac{1}{n - 1} - \int_{n-1}^{n} \dfrac{dx}{x} + \dfrac{1}{n}$

    $\qquad = 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n} - \int_1^n \dfrac{dx}{x} = 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n} - \log n$.

Since $s_{2n-1} \to C$ as $n \to \infty$, we obtain the following limit formula: 

(10.49)    $\lim_{n \to \infty} \left( 1 + \dfrac{1}{2} + \cdots + \dfrac{1}{n} - \log n \right) = C$.

The number C defined by this limit is called Euler's constant (sometimes denoted by $\gamma$). 
Like $\pi$ and e, this number appears in many analytic formulas. Its value, correct to ten 
decimals, is 0.5772156649. An interesting problem, unsolved to this time, is to decide 
whether Euler’s constant is rational or irrational. 
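
The limit (10.49) can be approximated directly. This is only a sketch: the helper name `euler_estimate` and the sample values of n are assumptions, and `math.fsum` is used merely to limit rounding error in the long sum.

```python
import math

def euler_estimate(n):
    """Return 1 + 1/2 + ... + 1/n - log n, an approximation to Euler's constant."""
    return math.fsum(1 / k for k in range(1, n + 1)) - math.log(n)

if __name__ == "__main__":
    for n in (10, 1000, 10 ** 6):
        # The estimates decrease slowly toward 0.5772156649...
        print(n, euler_estimate(n))
```

The convergence is slow, so this direct approach is not a practical way to obtain many decimals of C.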

Relation (10.49) can also be expressed as follows:

(10.50) $$\sum_{k=1}^{n} \frac{1}{k} = \log n + C + o(1) \quad\text{as } n \to \infty.$$

From this it follows that the ratio $(1 + \frac{1}{2} + \cdots + 1/n)/\log n \to 1$ as $n \to \infty$, so the partial
sums of the harmonic series are asymptotically equal to $\log n$. That is, we have

$$\sum_{k=1}^{n} \frac{1}{k} \sim \log n \quad\text{as } n \to \infty.$$

The relation (10.50) not only explains why the harmonic series diverges, but it also gives
us some concrete idea of the rate of growth of its partial sums. In the next example we use
this relation to prove that the alternating harmonic series has the sum log 2.
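
The limit formula (10.49) and the asymptotic relation (10.50) can be observed numerically. The following is only an illustrative sketch: the function names `harmonic` and `euler_gap` are hypothetical helpers introduced here, and floating-point sums merely approximate the exact partial sums.

```python
import math


def harmonic(n: int) -> float:
    """Return the partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def euler_gap(n: int) -> float:
    """Return H_n - log n, which tends to Euler's constant C by (10.49)."""
    return harmonic(n) - math.log(n)


if __name__ == "__main__":
    # The gap settles near C while the ratio H_n / log n drifts slowly toward 1.
    for n in (10, 1_000, 100_000):
        print(n, euler_gap(n), harmonic(n) / math.log(n))
```

The gap decreases toward $C$ quite quickly, whereas the ratio in the asymptotic relation approaches 1 only at a logarithmic rate.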

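EXAMPLE 4. Let $s_m = \sum_{k=1}^{m} (-1)^{k-1}/k$. We know that $s_m$ tends to a limit as $m \to \infty$,
and we shall prove now that this limit is log 2. When $m$ is even, say $m = 2n$, we may
separate the positive and negative terms to obtain

$$s_{2n} = \sum_{k=1}^{n} \frac{1}{2k-1} - \sum_{k=1}^{n} \frac{1}{2k} = \left( \sum_{k=1}^{2n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{2k} \right) - \sum_{k=1}^{n} \frac{1}{2k} = \sum_{k=1}^{2n} \frac{1}{k} - \sum_{k=1}^{n} \frac{1}{k}.$$

Applying (10.50) to each sum on the extreme right, we obtain

$$s_{2n} = (\log 2n + C + o(1)) - (\log n + C + o(1)) = \log 2 + o(1),$$

so $s_{2n} \to \log 2$ as $n \to \infty$. This proves that the sum of the alternating harmonic series is log 2.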


10.18 Conditional and absolute convergence 

Although the alternating harmonic series $\sum (-1)^{n-1}/n$ is convergent, the series obtained
by replacing each term by its absolute value is divergent. This shows that, in general,
convergence of $\sum a_n$ does not imply convergence of $\sum |a_n|$. In the other direction, we have
the following theorem.


THEOREM 10.15. Assume $\sum |a_n|$ converges. Then $\sum a_n$ also converges, and we have

(10.51) $$\left| \sum_{n=1}^{\infty} a_n \right| \le \sum_{n=1}^{\infty} |a_n|.$$


Proof. Assume first that the terms $a_n$ are real. Let $b_n = a_n + |a_n|$. We shall prove
that $\sum b_n$ converges. It then follows (by Theorem 10.2) that $\sum a_n$ converges because
$a_n = b_n - |a_n|$.

Since $b_n$ is either 0 or $2|a_n|$, we have $0 \le b_n \le 2|a_n|$, and hence $\sum |a_n|$ dominates $\sum b_n$.
Therefore $\sum b_n$ converges and, as already mentioned, this implies convergence of $\sum a_n$.

Now suppose the terms $a_n$ are complex, say $a_n = u_n + i v_n$, where $u_n$ and $v_n$ are real.
Since $|u_n| \le |a_n|$, convergence of $\sum |a_n|$ implies convergence of $\sum |u_n|$ and this, in turn,
implies convergence of $\sum u_n$, since the $u_n$ are real. Similarly, $\sum v_n$ converges. By linearity,
the series $\sum (u_n + i v_n)$ converges.

To prove (10.51), we note that $\left| \sum_{k=1}^{n} a_k \right| \le \sum_{k=1}^{n} |a_k|$, and then we let $n \to \infty$.

DEFINITION. A series $\sum a_n$ is called absolutely convergent if $\sum |a_n|$ converges. It is
called conditionally convergent if $\sum a_n$ converges but $\sum |a_n|$ diverges.

If $\sum a_n$ and $\sum b_n$ are absolutely convergent, then so is the series $\sum (\alpha a_n + \beta b_n)$ for every
choice of $\alpha$ and $\beta$. This follows at once from the inequalities

$$\sum_{n=1}^{M} |\alpha a_n + \beta b_n| \le |\alpha| \sum_{n=1}^{M} |a_n| + |\beta| \sum_{n=1}^{M} |b_n| \le |\alpha| \sum_{n=1}^{\infty} |a_n| + |\beta| \sum_{n=1}^{\infty} |b_n|,$$

which show that the partial sums of $\sum |\alpha a_n + \beta b_n|$ are bounded.

10.19 The convergence tests of Dirichlet and Abel 

The convergence tests of the earlier sections that were developed for series of nonnegative 
terms may also be used to test absolute convergence of a series with arbitrary complex 
terms. In this section we discuss two tests that are often useful for determining convergence 
when the series might not converge absolutely. Both tests make use of an algebraic identity 
known as the Abel partial summation formula, named in honor of the Norwegian mathe- 
matician Niels Henrik Abel (1802-1829). Abel’s formula is analogous to the formula for 
integration by parts and may be described as follows. 

THEOREM 10.16. ABEL'S PARTIAL SUMMATION FORMULA. Let $\{a_n\}$ and $\{b_n\}$ be two
sequences of complex numbers, and let

$$A_n = \sum_{k=1}^{n} a_k.$$

Then we have the identity

(10.52) $$\sum_{k=1}^{n} a_k b_k = A_n b_{n+1} + \sum_{k=1}^{n} A_k (b_k - b_{k+1}).$$

Proof. If we define $A_0 = 0$, then $a_k = A_k - A_{k-1}$ for each $k = 1, 2, \ldots, n$, so we have

$$\sum_{k=1}^{n} a_k b_k = \sum_{k=1}^{n} (A_k - A_{k-1}) b_k = \sum_{k=1}^{n} A_k b_k - \sum_{k=1}^{n} A_k b_{k+1} + A_n b_{n+1},$$

which gives us (10.52).
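
Because (10.52) is a purely algebraic identity, it can be checked on any finite lists of numbers. The sketch below is an illustration only; `abel_sides` is a hypothetical helper name, and the lists are 0-indexed, so `a[k-1]` plays the role of $a_k$ and `b` must hold the $n + 1$ values $b_1, \ldots, b_{n+1}$.

```python
def abel_sides(a, b):
    """Return (lhs, rhs) of Abel's formula (10.52) for n = len(a) >= 1.

    a holds a_1..a_n and b holds b_1..b_{n+1} (0-indexed Python lists).
    """
    n = len(a)
    if n < 1 or len(b) != n + 1:
        raise ValueError("need n >= 1 terms in a and n + 1 terms in b")
    lhs = sum(a[k] * b[k] for k in range(n))
    # Partial sums A_1, ..., A_n of the sequence a.
    partial, total = [], 0
    for term in a:
        total += term
        partial.append(total)
    rhs = partial[n - 1] * b[n] + sum(
        partial[k] * (b[k] - b[k + 1]) for k in range(n)
    )
    return lhs, rhs


if __name__ == "__main__":
    print(abel_sides([1, 2, 3], [4, 5, 6, 7]))
    print(abel_sides([1 + 2j, -1j, 0.5], [2, 1j, -3, 1 - 1j]))
```

With integer inputs the two sides agree exactly; with floating-point or complex inputs they agree up to rounding.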

If we let $n \to \infty$ in (10.52), we see that the series $\sum a_k b_k$ converges if both the series
$\sum A_k (b_k - b_{k+1})$ and the sequence $\{A_n b_{n+1}\}$ converge. The next two tests give sufficient
conditions for these to converge.


THEOREM 10.17. DIRICHLET'S TEST. Let $\sum a_n$ be a series of complex terms whose partial
sums form a bounded sequence. Let $\{b_n\}$ be a decreasing sequence which converges to 0.
Then the series $\sum a_n b_n$ converges.

Proof. Using the notation of Theorem 10.16, there is an $M > 0$ such that $|A_n| \le M$
for all $n$. Therefore $A_n b_{n+1} \to 0$ as $n \to \infty$. To establish convergence of $\sum a_n b_n$, we need
only show that the series $\sum A_k (b_k - b_{k+1})$ is convergent. Since $\{b_n\}$ is decreasing, we have the inequality

$$|A_k (b_k - b_{k+1})| \le M (b_k - b_{k+1}).$$

But the series $\sum (b_k - b_{k+1})$ is a convergent telescoping series which dominates

$$\sum |A_k (b_k - b_{k+1})|.$$

This implies absolute convergence and hence convergence of $\sum A_k (b_k - b_{k+1})$.

THEOREM 10.18. ABEL'S TEST. Let $\sum a_n$ be a convergent series of complex terms and
let $\{b_n\}$ be a monotonic convergent sequence of real terms. Then the series $\sum a_n b_n$ converges.

Proof. Again we use the notation of Theorem 10.16. Convergence of $\sum a_n$ implies
convergence of the sequence $\{A_n\}$ and hence of the sequence $\{A_n b_{n+1}\}$. Also, $\{A_n\}$ is a
bounded sequence. The rest of the proof is similar to that of Dirichlet’s test.

To use Dirichlet’s test effectively, we need some examples of series having bounded
partial sums. Of course, every convergent series has this property. An important example
of a divergent series with bounded partial sums is the geometric series $\sum x^n$, where $x$ is a
complex number with $|x| = 1$ but $x \ne 1$. The next theorem gives an upper bound for the
partial sums of this series. When $|x| = 1$, we may write $x = e^{2i\theta}$, where $\theta$ is real, and we
have the following.


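THEOREM 10.19. For every real $\theta$ not an integer multiple of $\pi$, we have the identity

(10.53) $$\sum_{k=1}^{n} e^{2ik\theta} = \frac{\sin n\theta}{\sin \theta}\, e^{i(n+1)\theta},$$

from which we obtain the estimate

(10.54) $$\left| \sum_{k=1}^{n} e^{2ik\theta} \right| \le \frac{1}{|\sin \theta|}.$$
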

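Proof. If $x \ne 1$, the partial sums of the geometric series are given by

$$\sum_{k=1}^{n} x^k = x\,\frac{x^n - 1}{x - 1}.$$

Writing $x = e^{2i\theta}$ in this formula, where $\theta$ is real but not an integer multiple of $\pi$, we find

$$\sum_{k=1}^{n} e^{2ik\theta} = e^{2i\theta}\, \frac{e^{2in\theta} - 1}{e^{2i\theta} - 1} = e^{2i\theta}\, \frac{e^{in\theta}}{e^{i\theta}} \cdot \frac{e^{in\theta} - e^{-in\theta}}{e^{i\theta} - e^{-i\theta}} = \frac{\sin n\theta}{\sin \theta}\, e^{i(n+1)\theta}.$$

This proves (10.53). To deduce (10.54), we simply note that $|\sin n\theta| \le 1$ and $|e^{i(n+1)\theta}| = 1$.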
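
Identity (10.53) and the bound (10.54) lend themselves to a quick numerical spot check. The following sketch uses Python's standard `cmath` module; the helper names `exp_sum` and `closed_form` are hypothetical, and agreement is only up to floating-point rounding.

```python
import cmath
import math


def exp_sum(n: int, theta: float) -> complex:
    """Left side of (10.53): the sum of exp(2ik*theta) for k = 1..n."""
    return sum(cmath.exp(2j * k * theta) for k in range(1, n + 1))


def closed_form(n: int, theta: float) -> complex:
    """Right side of (10.53); theta must not be an integer multiple of pi."""
    return math.sin(n * theta) / math.sin(theta) * cmath.exp(1j * (n + 1) * theta)


if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.5):
        for n in (1, 5, 50):
            gap = abs(exp_sum(n, theta) - closed_form(n, theta))
            bound = 1.0 / abs(math.sin(theta))
            print(theta, n, gap, abs(exp_sum(n, theta)) <= bound + 1e-12)
```

The bound does not depend on $n$, which is exactly the boundedness of partial sums that Dirichlet’s test requires.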
EXAMPLES. Assume $\{b_n\}$ is any decreasing sequence of real numbers with limit 0. Taking
$a_n = x^n$ in Dirichlet’s test, where $x$ is complex, $|x| = 1$, $x \ne 1$, we find that the series

(10.55) $$\sum_{n=1}^{\infty} b_n x^n$$

converges. Note that Leibniz’s rule for alternating series is merely the special case in which
$x = -1$. If we write $x = e^{i\theta}$, where $\theta$ is real but not an integer multiple of $2\pi$, and consider
the real and imaginary parts of (10.55), we deduce that the two trigonometric series

$$\sum_{n=1}^{\infty} b_n \cos n\theta \quad\text{and}\quad \sum_{n=1}^{\infty} b_n \sin n\theta$$

converge. In particular, when $b_n = n^{-\alpha}$, where $\alpha > 0$, we find the following series converge:

$$\sum_{n=1}^{\infty} \frac{\cos n\theta}{n^{\alpha}}, \qquad \sum_{n=1}^{\infty} \frac{\sin n\theta}{n^{\alpha}}.$$

When $\alpha > 1$, they converge absolutely since they are dominated by $\sum_{n=1}^{\infty} n^{-\alpha}$.

17


18 


• (1 - «sin-j ■ 

n - 1 n ' 

oo 

. ^ ( — i )’ 1 (i — cos ■ 

H=\ H ' 

go 

19. "V ( — l)"arctan 


2n + 1 . 


21 


■ ^ ( — 1)" I - — arctan (log n ) 

«= i 

• 2 io g( 

n= 1 ' 


1 + -r 


|sin «| 


27. a„ where a, = 

CO 

28. a„ where a, = 

»_ / i \'312 

29 


22. sin I htt + , 

log n 


n--=2 

CO 


23. Z-j «(1 + 1/2 + . . . + 1 jn) ' 


24. ^O 1 )" 


»=1 


c - 1 + 


(- 1 )" 


25. Zj(» + (-!)»)* 

26. (-l)«<«-D/2 IlL- 

n= l ' 


100 


1 In 

if n is a square, 

1/rt 2 

otherwise. 

1/n 2 

if n is odd. 

-1 In 

if n is even. 


’■ 2 ("\} ■ 

n=l 


3 0 


oo . 

2 sm ^1 

n 


m 


31 


32 


n= 1 


• 2 ( i - nsi % 
n=l ' 

00 

-sr' 1 — n sin (1 /n) 

■£* n 

n=\ 


In Exercises 33 through 46, describe the set of all complex z for which the series converges. 
In Exercises 47 and 48, determine the set of real x for which the given series converges.

oo „ - 00 

2 2" sin 2 ” x 

(-1)” . 48. 

71= 1 


'2” sin" x 


n 




In Exercises 49 through 52, the series are assumed to have real terms.

49. If $a_n > 0$ and $\sum a_n$ converges, prove that $\sum 1/a_n$ diverges.

50. If $\sum |a_n|$ converges, prove that $\sum a_n^2$ converges. Give a counterexample in which $\sum a_n^2$ converges
but $\sum |a_n|$ diverges.

51. Given a convergent series $\sum a_n$, where each $a_n \ge 0$. Prove that $\sum \sqrt{a_n}\, n^{-p}$ converges if $p > \frac{1}{2}$.
Give a counterexample for $p = \frac{1}{2}$.

52. Prove or disprove the following statements:

(a) If $\sum a_n$ converges absolutely, then so does $\sum a_n^2/(1 + a_n^2)$.

(b) If $\sum a_n$ converges absolutely, and if no $a_n = -1$, then $\sum a_n/(1 + a_n)$ converges absolutely.


*10.21 Rearrangements of series 

The order of the terms in a finite sum can be rearranged without affecting the value of 
the sum. In 1833 Cauchy made the surprising discovery that this is not always true for 
infinite series. For example, consider the alternating harmonic series 

(10.56) $$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots = \log 2.$$

The convergence of this series to the sum log 2 was shown in Section 10.17. If we rearrange 
the terms of this series, taking alternately two positive terms followed by one negative 
term, we get a new series which can be designated as follows: 


(10.57) $$1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + + - \cdots.$$


Each term which occurs in the alternating harmonic series occurs exactly once in this 
rearrangement, and vice versa. But we can easily prove that this new series has a sum 
greater than log 2. We proceed as follows: 

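Let $t_n$ denote the $n$th partial sum of (10.57). If $n$ is a multiple of 3, say $n = 3m$, the
partial sum $t_{3m}$ contains $2m$ positive terms and $m$ negative terms and is given by

$$t_{3m} = \sum_{k=1}^{2m} \frac{1}{2k-1} - \sum_{k=1}^{m} \frac{1}{2k} = \left( \sum_{k=1}^{4m} \frac{1}{k} - \sum_{k=1}^{2m} \frac{1}{2k} \right) - \sum_{k=1}^{m} \frac{1}{2k} = \sum_{k=1}^{4m} \frac{1}{k} - \frac{1}{2} \sum_{k=1}^{2m} \frac{1}{k} - \frac{1}{2} \sum_{k=1}^{m} \frac{1}{k}.$$

In each of the last three sums, we use the asymptotic relation

$$\sum_{k=1}^{n} \frac{1}{k} = \log n + C + o(1) \quad\text{as } n \to \infty$$

to obtain

$$t_{3m} = (\log 4m + C + o(1)) - \tfrac{1}{2}(\log 2m + C + o(1)) - \tfrac{1}{2}(\log m + C + o(1)) = \tfrac{3}{2} \log 2 + o(1).$$

Thus $t_{3m} \to \frac{3}{2} \log 2$ as $m \to \infty$. But $t_{3m+1} = t_{3m} + 1/(4m + 1)$ and $t_{3m-1} = t_{3m} + 1/(2m)$,
so $t_{3m+1}$ and $t_{3m-1}$ have the same limit as $t_{3m}$ when $m \to \infty$. Therefore, every partial sum
$t_n$ has the limit $\frac{3}{2} \log 2$ as $n \to \infty$, so the sum of the series in (10.57) is $\frac{3}{2} \log 2$.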

The foregoing example shows that rearrangement of the terms of a convergent series 
may alter its sum. We shall prove next that this can happen only if the given series is 
conditionally convergent. That is, rearrangement of an absolutely convergent series does 
not alter its sum. Before we prove this, we will explain more precisely what is meant by a 
rearrangement. 


DEFINITION. Let $P = \{1, 2, 3, \ldots\}$ denote the set of positive integers. Let $f$ be a function
whose domain is $P$ and whose range is $P$, and assume $f$ has the following property:

$$m \ne n \quad\text{implies}\quad f(m) \ne f(n).$$

Such a function $f$ is called a permutation of $P$, or a one-to-one mapping of $P$ onto itself. If
$\sum a_n$ and $\sum b_n$ are two series such that for every $n \ge 1$ we have

$$b_n = a_{f(n)}$$

for some permutation $f$, then the series $\sum b_n$ is said to be a rearrangement of $\sum a_n$.

EXAMPLE. If $\sum a_n$ denotes the alternating harmonic series in (10.56) and if $\sum b_n$ denotes
the series in (10.57), we have $b_n = a_{f(n)}$, where $f$ is the permutation defined by the formulas

$$f(3n + 1) = 4n + 1, \qquad f(3n + 2) = 4n + 3, \qquad f(3n + 3) = 2n + 2.$$
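
The permutation in this example can be coded directly and used to observe the sum $\frac{3}{2} \log 2$ of the rearranged series (10.57). This is a sketch for illustration; the names `f`, `a`, and `rearranged_partial_sum` are chosen here, and the index `n` in `f` runs over 1, 2, 3, ... (so the formulas above are applied with $n = 0, 1, 2, \ldots$ on their left-hand sides).

```python
import math


def f(n: int) -> int:
    """Permutation of the example: f(3q+1) = 4q+1, f(3q+2) = 4q+3, f(3q+3) = 2q+2."""
    q, r = divmod(n - 1, 3)  # write n = 3q + r + 1 with r in {0, 1, 2}
    if r == 0:
        return 4 * q + 1
    if r == 1:
        return 4 * q + 3
    return 2 * q + 2


def a(n: int) -> float:
    """nth term of the alternating harmonic series (10.56)."""
    return (-1) ** (n - 1) / n


def rearranged_partial_sum(n: int) -> float:
    """nth partial sum of the rearranged series, with b_k = a_{f(k)}."""
    return math.fsum(a(f(k)) for k in range(1, n + 1))


if __name__ == "__main__":
    print([f(k) for k in range(1, 10)])
    print(rearranged_partial_sum(300_000), 1.5 * math.log(2))
```

The first printed list shows the indices of (10.56) in the order in which (10.57) uses them.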


THEOREM 10.20. Let $\sum a_n$ be an absolutely convergent series having sum $S$. Then every
rearrangement of $\sum a_n$ also converges absolutely and has sum $S$.


Proof. Let $\sum b_n$ be a rearrangement, say $b_n = a_{f(n)}$. First we note that $\sum b_n$ converges
absolutely because $\sum |b_n|$ is a series of nonnegative terms whose partial sums are bounded
above by $\sum |a_n|$.

To prove that $\sum b_n$ also has sum $S$, we introduce

$$B_n = \sum_{k=1}^{n} b_k, \qquad A_n = \sum_{k=1}^{n} a_k, \qquad A_n^* = \sum_{k=1}^{n} |a_k|, \qquad\text{and}\qquad S^* = \sum_{k=1}^{\infty} |a_k|.$$

Now $A_n \to S$ and $A_n^* \to S^*$ as $n \to \infty$. Therefore, given any $\epsilon > 0$, there is an $N$ such that

$$|A_N - S| < \frac{\epsilon}{2} \quad\text{and}\quad |A_N^* - S^*| < \frac{\epsilon}{2}.$$

For this $N$ we can choose $M$ so that

$$\{1, 2, \ldots, N\} \subseteq \{f(1), f(2), \ldots, f(M)\}.$$

This is possible because the range of $f$ includes all the positive integers. If $n \ge M$, we have

(10.58) $$|B_n - S| = |B_n - A_N + A_N - S| \le |B_n - A_N| + |A_N - S| \le |B_n - A_N| + \frac{\epsilon}{2}.$$

But we also have

$$|B_n - A_N| = \left| \sum_{k=1}^{n} b_k - \sum_{k=1}^{N} a_k \right| = \left| \sum_{k=1}^{n} a_{f(k)} - \sum_{k=1}^{N} a_k \right|.$$

The terms $a_1, \ldots, a_N$ cancel in the subtraction, so we have

$$|B_n - A_N| \le |a_{N+1}| + |a_{N+2}| + \cdots = |A_N^* - S^*| < \frac{\epsilon}{2}.$$

Combining this with (10.58), we see that $|B_n - S| < \epsilon$ for all $n \ge M$, which means that
$B_n \to S$ as $n \to \infty$. This proves that the rearranged series $\sum b_n$ has sum $S$.


The hypothesis of absolute convergence in Theorem 10.20 is essential. Riemann discovered
that a conditionally convergent series of real terms can always be rearranged to give
a series which converges to any preassigned sum. Riemann’s argument is based on a special
property of conditionally convergent series of real terms. Such a series $\sum a_n$ has infinitely
many positive terms and infinitely many negative terms. Consider the two new series $\sum a_n^+$
and $\sum a_n^-$ obtained by taking the positive terms alone and the negative terms alone. More
specifically, define $a_n^+$ and $a_n^-$ as follows:

(10.59) $$a_n^+ = \frac{a_n + |a_n|}{2}, \qquad a_n^- = \frac{a_n - |a_n|}{2}.$$

If $a_n$ is positive, then $a_n^+ = a_n$ and $a_n^- = 0$; if $a_n$ is negative, then $a_n^- = a_n$ and $a_n^+ = 0$.
The two new series $\sum a_n^+$ and $\sum a_n^-$ are related to the given series $\sum a_n$ as follows.


THEOREM 10.21. Given a series $\sum a_n$ of real terms, define $a_n^+$ and $a_n^-$ by (10.59).

(a) If $\sum a_n$ is conditionally convergent, both $\sum a_n^+$ and $\sum a_n^-$ diverge.

(b) If $\sum a_n$ is absolutely convergent, both $\sum a_n^+$ and $\sum a_n^-$ converge, and we have

(10.60) $$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} a_n^+ + \sum_{n=1}^{\infty} a_n^-.$$

Proof. To prove part (a), we note that $\sum \frac{1}{2} a_n$ converges and $\sum \frac{1}{2} |a_n|$ diverges. Therefore,
by the linearity property (Theorem 10.3) $\sum a_n^+$ diverges and $\sum a_n^-$ diverges. To prove part
(b), we note that both $\sum \frac{1}{2} a_n$ and $\sum \frac{1}{2} |a_n|$ converge, so by the linearity property (Theorem
10.2) both $\sum a_n^+$ and $\sum a_n^-$ converge. Since $a_n = a_n^+ + a_n^-$, we also obtain (10.60).

Now we can easily prove Riemann’s rearrangement theorem.


THEOREM 10.22. Let $\sum a_n$ be a conditionally convergent series of real terms, and let $S$
be a given real number. Then there is a rearrangement $\sum b_n$ of $\sum a_n$ which converges to the
sum $S$.
Proof. Define $a_n^+$ and $a_n^-$ as indicated in (10.59). Both series $\sum a_n^+$ and $\sum a_n^-$ diverge
since $\sum a_n$ is conditionally convergent. We rearrange $\sum a_n$ as follows:

Take, in order, just enough positive terms $a_n^+$ so that their sum exceeds $S$. If $p_1$ positive
terms are required, we have

$$\sum_{n=1}^{p_1} a_n^+ > S \quad\text{but}\quad \sum_{n=1}^{q} a_n^+ \le S \quad\text{if } q < p_1.$$

This is always possible since the partial sums of $\sum a_n^+$ tend to $+\infty$. To this sum we add
just enough negative terms $a_n^-$, say $n_1$ negative terms, so that the resulting sum is less than $S$.
This is possible since the partial sums of $\sum a_n^-$ tend to $-\infty$. Thus, we have

$$\sum_{n=1}^{p_1} a_n^+ + \sum_{n=1}^{n_1} a_n^- < S \quad\text{but}\quad \sum_{n=1}^{p_1} a_n^+ + \sum_{n=1}^{m} a_n^- \ge S \quad\text{if } m < n_1.$$

Now we repeat the process, adding just enough new positive terms to make the sum exceed
$S$, and then just enough new negative terms to make the sum less than $S$. Continuing in
this way, we obtain a rearrangement $\sum b_n$. Each partial sum of $\sum b_n$ differs from $S$ by at
most one term $a_n^+$ or $a_n^-$. But $a_n \to 0$ as $n \to \infty$ since $\sum a_n$ converges, so the partial sums
of $\sum b_n$ tend to $S$. This proves that the rearranged series $\sum b_n$ converges and has sum $S$,
as asserted.
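
The construction in this proof is effectively an algorithm, and it can be tried out on the alternating harmonic series. The sketch below is an illustration under stated assumptions: the function name `riemann_rearrangement` is hypothetical, the positive terms are taken to be $1, \frac{1}{3}, \frac{1}{5}, \ldots$ and the negative terms $-\frac{1}{2}, -\frac{1}{4}, \ldots$, and only finitely many terms are generated.

```python
from itertools import count


def riemann_rearrangement(target: float, n_terms: int) -> list:
    """Return the first n_terms of a rearrangement of 1 - 1/2 + 1/3 - ...

    Positive terms are added until the running sum exceeds target, then
    negative terms until it drops below target, and so on, as in the proof.
    """
    positives = (1.0 / k for k in count(1, 2))   # 1, 1/3, 1/5, ...
    negatives = (-1.0 / k for k in count(2, 2))  # -1/2, -1/4, ...
    terms, total, going_up = [], 0.0, True
    for _ in range(n_terms):
        if going_up:
            term = next(positives)
            total += term
            if total > target:      # sum now exceeds S: switch to negatives
                going_up = False
        else:
            term = next(negatives)
            total += term
            if total < target:      # sum now below S: switch to positives
                going_up = True
        terms.append(term)
    return terms


if __name__ == "__main__":
    for s in (-1.0, 0.25, 1.0, 2.0):
        print(s, sum(riemann_rearrangement(s, 20_000)))
```

Each positive and each negative term is consumed in its original order and never reused, so the output is an initial segment of a genuine rearrangement.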


10.22 Miscellaneous review exercises 

1. (a) Let $a_n = \sqrt{n + 1} - \sqrt{n}$. Compute $\lim_{n\to\infty} a_n$.

(b) Let $a_n = (n + 1)^c - n^c$, where $c$ is real. Determine those $c$ for which the sequence $\{a_n\}$
converges and those for which it diverges. In case of convergence, compute the limit of the
sequence. Remember that $c$ can be positive, negative, or zero.

2. (a) If $0 < x < 1$, prove that $(1 + x^n)^{1/n}$ approaches a limit as $n \to \infty$ and compute this
limit.

(b) Given $a > 0$, $b > 0$, compute $\lim_{n\to\infty} (a^n + b^n)^{1/n}$.

3. A sequence $\{a_n\}$ is defined recursively in terms of $a_1$ and $a_2$ by the formula

$$a_{n+1} = \frac{a_n + a_{n-1}}{2} \quad\text{for } n \ge 2.$$

(a) Assuming that $\{a_n\}$ converges, compute the limit of the sequence in terms of $a_1$ and $a_2$.
The result is a weighted arithmetic mean of $a_1$ and $a_2$.

(b) Prove that for every choice of $a_1$ and $a_2$ the sequence $\{a_n\}$ converges. You may assume that
$a_1 < a_2$. [Hint: Consider $\{a_{2n}\}$ and $\{a_{2n+1}\}$ separately.]

4. A sequence $\{x_n\}$ is defined by the following recursion formula:

$$x_1 = 1, \qquad x_{n+1} = \sqrt{1 + x_n}.$$

Prove that the sequence converges and find its limit.

5. A sequence $\{x_n\}$ is defined by the following recursion formula:

$$x_0 = 1, \qquad x_1 = 1, \qquad \frac{1}{x_{n+2}} = \frac{1}{x_{n+1}} + \frac{1}{x_n}.$$

Prove that the sequence converges and find its limit.
6. Let $\{a_n\}$ and $\{b_n\}$ be two sequences such that for each $n$ we have

$$e^{a_n} = a_n + e^{b_n}.$$

(a) Show that $a_n > 0$ implies $b_n > 0$.

(b) If $a_n > 0$ for all $n$ and if $\sum a_n$ converges, show that $\sum (b_n/a_n)$ converges.

In Exercises 7 through 11, test the given series for convergence.

7. $\sum_{n=1}^{\infty} (\sqrt{1 + n^2} - n)$.

8. $\sum_{n=1}^{\infty} n^s (\sqrt{n + 1} - 2\sqrt{n} + \sqrt{n - 1})$.

9. $\sum_{n=2}^{\infty} \dfrac{1}{(\log n)^{\log n}}$.

10. $\sum_{n=1}^{\infty} \dfrac{1}{n^{1+1/n}}$.

11. $\sum_{n=1}^{\infty} a_n$, where $a_n = 1/n$ if $n$ is odd, $a_n = 1/n^2$ if $n$ is even.

12. Show that the infinite series

$$\sum_{n=0}^{\infty} (\sqrt{n^a + 1} - \sqrt{n^a})$$

converges for $a > 2$ and diverges for $a = 2$.

13. Given $a_n > 0$ for each $n$. For each of the following statements, give a proof or exhibit a
counterexample.

(a) If $\sum_{n=1}^{\infty} a_n$ diverges, then $\sum_{n=1}^{\infty} a_n^2$ diverges.

(b) If $\sum_{n=1}^{\infty} a_n^2$ converges, then $\sum_{n=1}^{\infty} a_n/n$ converges.

14. Find all real $c$ for which the series $\sum_{n=1}^{\infty} (n!)^c/(3n)!$ converges.

15. Find all integers $a \ge 1$ for which the series $\sum_{n=1}^{\infty} (n!)^3/(an)!$ converges.

16. Let $n_1 < n_2 < n_3 < \cdots$ denote those positive integers that do not involve the digit 0 in their
decimal representations. Thus $n_1 = 1$, $n_2 = 2$, ..., $n_9 = 9$, $n_{10} = 11$, ..., $n_{18} = 19$, $n_{19} = 21$,
etc. Show that the series of reciprocals $\sum_{k=1}^{\infty} 1/n_k$ converges and has a sum less than 90.

[Hint: Dominate the series by $9 \sum_{n=0}^{\infty} (9/10)^n$.]

17. If $a$ is an arbitrary real number, let $s_n(a) = 1^a + 2^a + \cdots + n^a$. Determine the following
limit:

$$\lim_{n\to\infty} \frac{s_n(a + 1)}{n\, s_n(a)}.$$

(Consider both positive and negative $a$, as well as $a = 0$.)

18. (a) If $p$ and $q$ are fixed integers, $p \ge q \ge 1$, show that

$$\lim_{n\to\infty} \sum_{k=qn}^{pn} \frac{1}{k} = \log \frac{p}{q}.$$


(b) The following series is a rearrangement of the alternating harmonic series in which there
appear, alternately, three positive terms followed by two negative terms:

$$1 + \frac{1}{3} + \frac{1}{5} - \frac{1}{2} - \frac{1}{4} + \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} - \frac{1}{8} + + + - - \cdots.$$

Show that the series converges and has sum $\log 2 + \frac{1}{2} \log \frac{3}{2}$.
[Hint: Consider the partial sum $s_{5n}$ and use part (a).]

(c) Rearrange the alternating harmonic series, writing alternately $p$ positive terms followed
by $q$ negative terms. Then use part (a) to show that this rearranged series converges and has
sum $\log 2 + \frac{1}{2} \log (p/q)$.

10.23 Improper integrals 

The concept of an integral $\int_a^b f(x)\,dx$ was introduced in Chapter 1 under the restriction
that the function $f$ is defined and bounded on a finite interval $[a, b]$. The scope of integration
theory may be extended by relaxing these restrictions.

To begin with, we may study the behavior of $\int_a^b f(x)\,dx$ as $b \to +\infty$. This leads to the
notion of an infinite integral (also called an improper integral of the first kind) denoted by
the symbol $\int_a^{\infty} f(x)\,dx$. Another extension is obtained if we keep the interval $[a, b]$ finite
and allow $f$ to become unbounded at one or more points. The new integrals so obtained
(by a suitable limit process) are called improper integrals of the second kind. To distinguish 
the integrals of Chapter 1 from improper integrals, the former are often called “proper” 
integrals. 

Many important functions in analysis appear as improper integrals of one kind or 
another, and a detailed study of such functions is ordinarily undertaken in courses in 
advanced calculus. We shall be concerned here only with the most elementary aspects of 
the theory. In fact, we shall merely state some definitions and theorems and give some 
examples. 

It will be evident presently that the definitions pertaining to improper integrals bear a 
strong resemblance to those for infinite series. Therefore it is not surprising that many of 
the elementary theorems on series have direct analogs for improper integrals. 

If the proper integral $\int_a^b f(x)\,dx$ exists for every $b \ge a$, we may define a new function $I$
as follows:

$$I(b) = \int_a^b f(x)\,dx \quad\text{for each } b \ge a.$$

The function $I$ defined in this way is called an infinite integral, or an improper integral of
the first kind, and it is denoted by the symbol $\int_a^{\infty} f(x)\,dx$. The integral is said to converge
if the limit

(10.61) $$\lim_{b\to+\infty} I(b) = \lim_{b\to+\infty} \int_a^b f(x)\,dx$$

exists and is finite. Otherwise, the integral $\int_a^{\infty} f(x)\,dx$ is said to diverge. If the limit in
(10.61) exists and equals $A$, the number $A$ is called the value of the integral, and we write

$$\int_a^{\infty} f(x)\,dx = A.$$

These definitions are similar to those given for infinite series. The function values $I(b)$
play the role of the “partial sums” and may be referred to as “partial integrals.” Note
that the symbol $\int_a^{\infty} f(x)\,dx$ is used both for the integral and for the value of the integral
when the integral converges. (Compare with the remarks near the end of Section
10.5.)
EXAMPLE 1. The improper integral $\int_1^{\infty} x^{-s}\,dx$ converges if $s > 1$ and diverges if $s \le 1$.
To prove this, we note that

$$I(b) = \int_1^b x^{-s}\,dx = \begin{cases} \dfrac{b^{1-s} - 1}{1 - s} & \text{if } s \ne 1, \\[2mm] \log b & \text{if } s = 1. \end{cases}$$

Therefore $I(b)$ tends to a finite limit if and only if $s > 1$, in which case the limit is

$$\int_1^{\infty} x^{-s}\,dx = \frac{1}{s - 1}.$$

The behavior of this integral is analogous to that of the series for the zeta-function,
$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$.

EXAMPLE 2. The integral $\int_0^{\infty} \sin x\,dx$ diverges because

$$I(b) = \int_0^b \sin x\,dx = 1 - \cos b,$$

and this does not tend to a limit as $b \to +\infty$.
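
The role of the "partial integrals" $I(b)$ in Example 1 can be seen by evaluating the closed form for increasing $b$. The sketch below is illustrative only, and `partial_integral` is a hypothetical helper name.

```python
import math


def partial_integral(s: float, b: float) -> float:
    """I(b): the integral of x**(-s) from 1 to b, via the closed form of Example 1."""
    if s == 1:
        return math.log(b)
    return (b ** (1 - s) - 1) / (1 - s)


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0, 3.0):
        values = [partial_integral(s, 10.0 ** j) for j in (1, 3, 6)]
        # For s > 1 the values approach 1 / (s - 1); for s <= 1 they keep growing.
        print(s, values)
```

For $s > 1$ the printed values level off at $1/(s - 1)$, while for $s \le 1$ they grow without bound, mirroring the convergence behavior stated in the example.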


Infinite integrals of the form $\int_{-\infty}^{b} f(x)\,dx$ are similarly defined. Also, if $\int_{-\infty}^{c} f(x)\,dx$ and
$\int_c^{\infty} f(x)\,dx$ are both convergent for some $c$, we say that the integral $\int_{-\infty}^{\infty} f(x)\,dx$ is convergent,
and its value is defined to be the sum

(10.62) $$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{c} f(x)\,dx + \int_c^{\infty} f(x)\,dx.$$

(It is easy to show that the choice of $c$ is unimportant.) The integral $\int_{-\infty}^{\infty} f(x)\,dx$ is said to
diverge if at least one of the integrals on the right of (10.62) is divergent.

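EXAMPLE 3. The integral $\int_{-\infty}^{\infty} e^{-a|x|}\,dx$ converges if $a > 0$, for if $b > 0$, we have

$$\int_0^b e^{-a|x|}\,dx = \int_0^b e^{-ax}\,dx = \frac{e^{-ab} - 1}{-a} \to \frac{1}{a} \quad\text{as } b \to \infty.$$

Hence $\int_0^{\infty} e^{-a|x|}\,dx$ converges and has the value $1/a$. Also, if $b > 0$, we have

$$\int_{-b}^{0} e^{-a|x|}\,dx = \int_{-b}^{0} e^{ax}\,dx = -\int_b^0 e^{-at}\,dt = \int_0^b e^{-at}\,dt.$$

Therefore $\int_{-\infty}^{0} e^{-a|x|}\,dx$ also converges and has the value $1/a$. Hence we have $\int_{-\infty}^{\infty} e^{-a|x|}\,dx = 2/a$.
Note, however, that the integral $\int_{-\infty}^{\infty} e^{-ax}\,dx$ diverges because $\int_{-\infty}^{0} e^{-ax}\,dx$ diverges.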

As in the case of series, we have various convergence tests for improper integrals. The 
simplest of these refers to a positive integrand. 
THEOREM 10.23. Assume that the proper integral $\int_a^b f(x)\,dx$ exists for each $b \ge a$ and
suppose that $f(x) \ge 0$ for all $x \ge a$. Then $\int_a^{\infty} f(x)\,dx$ converges if and only if there is a
constant $M > 0$ such that

$$\int_a^b f(x)\,dx \le M \quad\text{for every } b \ge a.$$

This theorem forms the basis for the following comparison tests.

THEOREM 10.24. Assume the proper integral $\int_a^b f(x)\,dx$ exists for each $b \ge a$ and suppose
that $0 \le f(x) \le g(x)$ for all $x \ge a$, where $\int_a^{\infty} g(x)\,dx$ converges. Then $\int_a^{\infty} f(x)\,dx$ also
converges and

$$\int_a^{\infty} f(x)\,dx \le \int_a^{\infty} g(x)\,dx.$$

Note: The integral $\int_a^{\infty} g(x)\,dx$ is said to dominate the integral $\int_a^{\infty} f(x)\,dx$.

THEOREM 10.25. LIMIT COMPARISON TEST. Assume both proper integrals $\int_a^b f(x)\,dx$ and
$\int_a^b g(x)\,dx$ exist for each $b \ge a$, where $f(x) \ge 0$ and $g(x) > 0$ for all $x \ge a$. If

(10.63) $$\lim_{x\to+\infty} \frac{f(x)}{g(x)} = c, \quad\text{where } c \ne 0,$$

then both integrals $\int_a^{\infty} f(x)\,dx$ and $\int_a^{\infty} g(x)\,dx$ converge or both diverge.

Note: If the limit in (10.63) is 0, we can conclude only that convergence of $\int_a^{\infty} g(x)\,dx$
implies convergence of $\int_a^{\infty} f(x)\,dx$.

The proofs of Theorem 10.23 through 10.25 are similar to the corresponding results for 
series and are left as exercises. 

EXAMPLE 4. For each real $s$, the integral $\int_1^{\infty} e^{-x} x^s\,dx$ converges. This is seen by comparison
with $\int_1^{\infty} x^{-2}\,dx$ since $e^{-x} x^s / x^{-2} \to 0$ as $x \to +\infty$.

Improper integrals of the second kind may be introduced as follows: Suppose $f$ is
defined on the half-open interval $(a, b]$, and assume that the integral $\int_x^b f(t)\,dt$ exists for
each $x$ satisfying $a < x \le b$. Define a new function $I$ as follows:

$$I(x) = \int_x^b f(t)\,dt \quad\text{if } a < x \le b.$$

The function $I$ so defined is called an improper integral of the second kind and is denoted
by the symbol $\int_{a+}^{b} f(t)\,dt$. The integral is said to converge if the limit

(10.64) $$\lim_{x\to a+} I(x) = \lim_{x\to a+} \int_x^b f(t)\,dt$$

exists and is finite. Otherwise, the integral $\int_{a+}^{b} f(t)\,dt$ is said to diverge. If the limit in
(10.64) exists and equals $A$, the number $A$ is called the value of the integral, and we write

$$\int_{a+}^{b} f(t)\,dt = A.$$
EXAMPLE 5. Let $f(t) = t^{-s}$ if $t > 0$. If $b > 0$ and $x > 0$, we have

$$I(x) = \int_x^b t^{-s}\,dt = \begin{cases} \dfrac{b^{1-s} - x^{1-s}}{1 - s} & \text{if } s \ne 1, \\[2mm] \log b - \log x & \text{if } s = 1. \end{cases}$$

When $x \to 0+$, $I(x)$ tends to a finite limit if and only if $s < 1$. Hence the integral $\int_{0+}^{b} t^{-s}\,dt$
converges if $s < 1$ and diverges if $s \ge 1$.

This example may be dealt with in another way. If we introduce the substitution $t = 1/u$,
$dt = -u^{-2}\,du$, we obtain

$$\int_x^b t^{-s}\,dt = \int_{1/b}^{1/x} u^{s-2}\,du.$$

When $x \to 0+$, $1/x \to +\infty$ and hence $\int_{0+}^{b} t^{-s}\,dt = \int_{1/b}^{\infty} u^{s-2}\,du$, provided the last integral
converges. By Example 1, this converges if and only if $s - 2 < -1$, which means $s < 1$.

The foregoing example illustrates a remarkable geometric fact. Consider the function
$f$ defined by the equation $f(x) = x^{-3/4}$ if $0 < x \le 1$. The integral $\int_{0+}^{1} f(x)\,dx$ converges,
but the integral $\int_{0+}^{1} \pi f^2(x)\,dx$ diverges. Geometrically, this means that the ordinate set of
$f$ has a finite area, but the solid obtained by rotating this ordinate set about the x-axis has
an infinite volume.

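Improper integrals of the form $\int_a^{b-} f(t)\,dt$ are defined in a similar fashion. If the two
integrals $\int_{a+}^{c} f(t)\,dt$ and $\int_c^{b-} f(t)\,dt$ both converge, we write

$$\int_{a+}^{b-} f(t)\,dt = \int_{a+}^{c} f(t)\,dt + \int_c^{b-} f(t)\,dt.$$

Note: Some authors write $\int_a^b$ where we have written $\int_{a+}^{b-}$.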

The definition can be extended (in an obvious way) to cover the case of any finite number
of summands. For example, if $f$ is undefined at two points $c < d$ interior to an interval
$[a, b]$, we say the improper integral $\int_a^b f(t)\,dt$ converges and has the value $\int_a^{c-} f(t)\,dt +
\int_{c+}^{d-} f(t)\,dt + \int_{d+}^{b} f(t)\,dt$, provided that each of these integrals converges. Furthermore, we
can consider “mixed” combinations such as $\int_{a+}^{b} f(t)\,dt + \int_b^{\infty} f(t)\,dt$ which we write as
$\int_{a+}^{\infty} f(t)\,dt$, or mixed combinations of the form $\int_a^{c-} f(t)\,dt + \int_{c+}^{b} f(t)\,dt + \int_b^{\infty} f(t)\,dt$ which
we write simply as $\int_a^{\infty} f(t)\,dt$.

EXAMPLE 6. The gamma function. If $s > 0$ the integral $\int_{0+}^{\infty} e^{-t} t^{s-1}\,dt$ converges. This
must be interpreted as a sum, say

(10.65) $$\int_{0+}^{1} e^{-t} t^{s-1}\,dt + \int_1^{\infty} e^{-t} t^{s-1}\,dt.$$

The second integral converges for all real $s$, by Example 4. To test the first integral we put
$t = 1/u$ and note that

$$\int_x^1 e^{-t} t^{s-1}\,dt = \int_1^{1/x} e^{-1/u} u^{-s-1}\,du.$$

But $\int_1^{\infty} e^{-1/u} u^{-s-1}\,du$ converges for $s > 0$ by comparison with $\int_1^{\infty} u^{-s-1}\,du$. Therefore the
integral $\int_{0+}^{1} e^{-t} t^{s-1}\,dt$ converges for $s > 0$. When $s > 0$, the sum in (10.65) is denoted by
$\Gamma(s)$. The function $\Gamma$ so defined is called the gamma function, first introduced by Euler in
1729. It has the interesting property that $\Gamma(n + 1) = n!$ when $n$ is any integer $\ge 0$. (See
Exercise 19 of Section 10.24 for an outline of the proof.)
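
The property $\Gamma(n + 1) = n!$ can be observed numerically by approximating the integral in (10.65). The sketch below makes several assumptions that are not part of the text: it uses a plain midpoint rule, truncates the infinite range at an arbitrary upper limit of 60, and is restricted to $s \ge 1$ so that the integrand stays bounded near 0. The name `gamma_numeric` is hypothetical.

```python
import math


def gamma_numeric(s: float, upper: float = 60.0, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of exp(-t) * t**(s-1) over (0, upper].

    Intended for s >= 1 only; the tail beyond `upper` is neglected.
    """
    if s < 1:
        raise ValueError("this sketch only handles s >= 1")
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the ith subinterval
        total += math.exp(-t) * t ** (s - 1)
    return total * h


if __name__ == "__main__":
    for n in range(6):
        # Compare the numerical integral, the library gamma function, and n!.
        print(n, gamma_numeric(n + 1), math.gamma(n + 1), math.factorial(n))
```

The standard-library function `math.gamma` is included only as an independent reference value.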

The convergence tests given in Theorems 10.23 through 10.25 have straightforward 
analogs for improper integrals of the second kind. The reader should have no difficulty in 
formulating these tests for himself. 


10.24 Exercises 

In each of Exercises 1 through 10, test the improper integral for convergence. 
x 


V* 4 + 1 


' dx. 


dx. 


■ dx. 


■f 

■ /: 

. r , 

Jo V X 3 + 1 
f 00 1 

4. I — 7 = dx. 

Jo Ye x 

r m e ~ v * 

5. I — — dx. 

Jo+ y x 

11. For a certain real C the integral 


r 


6 

7 

8 

9. 

10 . 


Jo+ Vx 
p- log* 

' Jo+ 1 

■I 


cosh x 
~ dx 


Jo+ V x ] 


log* 

® dx 


2 X (log X) s ’ 


Cx 


x t + 1 2x + 1 


dx 


converges. Determine C and evaluate the integral. 

12. For a certain real C, the integral 


C 


2x z + 2C x + 1 


converges. Determine C and evaluate the integral. 

13. For a certain real C, the integral 


dx 


f 


\/\ + 2x2 - X + 1- 


dx 


converges. Determine C and evaluate the integral. 

14. Find the values of a and b such that 


a 


2x 2 + bx + a 
x(2x + a) 


- 1 d x = 1 . 
15. For what values of the constants $a$ and $b$ will the following limit exist and be equal to 1?

$$\lim_{p\to+\infty} \int_{-p}^{p} \frac{x^3 + ax^2 + bx}{x^2 + x + 1}\,dx.$$

16. (a) Prove that

$$\lim_{h\to 0+} \left( \int_{-1}^{-h} \frac{dx}{x} + \int_h^1 \frac{dx}{x} \right) = 0 \quad\text{and that}\quad \lim_{h\to+\infty} \int_{-h}^{h} \sin x\,dx = 0.$$

(b) Do the following improper integrals converge or diverge?

$$\int_{-1}^{1} \frac{dx}{x}; \qquad \int_{-\infty}^{\infty} \sin x\,dx.$$

17. (a) Prove that the integral $\int_{0+}^{1} (\sin x)/x\,dx$ converges.

(b) Prove that $\lim_{x\to 0+} x \int_x^1 (\cos t)/t^2\,dt = 1$.

(c) Does the integral $\int_{0+}^{1} (\cos t)/t^2\,dt$ converge or diverge?

18. (a) If $f$ is monotonic decreasing for all $x \ge 1$ and if $f(x) \to 0$ as $x \to +\infty$, prove that the
integral $\int_1^{\infty} f(x)\,dx$ and the series $\sum f(n)$ both converge or both diverge.

[Hint: Recall the proof of the integral test.]

(b) Give an example of a nonmonotonic $f$ for which the series $\sum f(n)$ converges and the
integral $\int_1^{\infty} f(x)\,dx$ diverges.

19. Let $\Gamma(s) = \int_{0+}^{\infty} e^{-t} t^{s-1}\,dt$, if $s > 0$. (The gamma function.) Use integration by parts to show
that $\Gamma(s + 1) = s\Gamma(s)$. Then use induction to prove that $\Gamma(n + 1) = n!$ if $n$ is a positive integer.

Each of Exercises 20 through 25 contains a statement, not necessarily true, about a function $f$
defined for all $x \ge 1$. In each of these exercises, $n$ denotes a positive integer, and $I_n$ denotes the
integral $\int_1^n f(x)\,dx$, which is always assumed to exist. For each statement either give a proof or
provide a counterexample.

20. If $f$ is monotonic decreasing and if $\lim_{n\to\infty} I_n$ exists, then the integral $\int_1^{\infty} f(x)\,dx$ converges.

21. If $\lim_{x\to\infty} f(x) = 0$ and $\lim_{n\to\infty} I_n = A$, then $\int_1^{\infty} f(x)\,dx$ converges and has the value $A$.

22. If the sequence $\{I_n\}$ converges, then the integral $\int_1^{\infty} f(x)\,dx$ converges.

23. If $f$ is positive and if $\lim_{n\to\infty} I_n = A$, then $\int_1^{\infty} f(x)\,dx$ converges and has the value $A$.

24. Assume $f'(x)$ exists for each $x \ge 1$ and suppose there is a constant $M > 0$ such that $|f'(x)| \le M$
for all $x \ge 1$. If $\lim_{n\to\infty} I_n = A$, then the integral $\int_1^{\infty} f(x)\,dx$ converges and has the value $A$.

25. If $\int_1^{\infty} f(x)\,dx$ converges, then $\lim_{x\to\infty} f(x) = 0$.




11 


SEQUENCES AND SERIES OF FUNCTIONS 


11.1 Pointwise convergence of sequences of functions 

In Chapter 10 we discussed sequences whose terms were real or complex numbers. Now
we wish to consider sequences $\{f_n\}$ whose terms are real- or complex-valued functions
having a common domain on the real line or in the complex plane. For each $x$ in the
domain, we can form another sequence $\{f_n(x)\}$ of numbers whose terms are the corresponding
function values. Let $S$ denote the set of points $x$ for which this sequence converges.
The function $f$ defined on $S$ by the equation

$$f(x) = \lim_{n\to\infty} f_n(x) \quad\text{if } x \in S,$$

is called the limit function of the sequence $\{f_n\}$, and we say that the sequence $\{f_n\}$ converges
pointwise to $f$ on the set $S$.

The study of such sequences is concerned primarily with the following type of question: 
If each term of a sequence $\{f_n\}$ has a certain property, such as continuity, differentiability,
or integrability, to what extent is this property transferred to the limit function? For
example, if each function $f_n$ is continuous at a point $x$, is the limit function $f$ also continuous
at $x$? The following example shows that, in general, it is not.


EXAMPLE 1. A sequence of continuous functions with a discontinuous limit function. Let
$f_n(x) = x^n$ if $0 \le x \le 1$. The graphs of a few terms are shown in Figure 11.1. The sequence
$\{f_n\}$ converges pointwise on the closed interval [0, 1], and its limit function $f$ is given by
the formula

\[ f(x) = \lim_{n\to\infty} x^n = \begin{cases} 0 & \text{if } 0 \le x < 1, \\ 1 & \text{if } x = 1. \end{cases} \]

Note that the limit function $f$ is discontinuous at 1, although each term of the sequence is
continuous in the entire interval [0, 1].


Figure 11.1 A sequence of continuous functions with a discontinuous limit function.

Figure 11.2 A sequence of functions for which $f_n \to 0$ on the interval [0, 1] but
$\int_0^1 f_n \to \tfrac{1}{2}$ as $n \to \infty$.

EXAMPLE 2. A sequence for which $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n\to\infty} f_n(x)\,dx$. Let
$f_n(x) = nx(1 - x^2)^n$ for $0 \le x \le 1$. In this example, the sequence $\{f_n\}$ converges pointwise to a
limit function $f$ which is 0 everywhere in the closed interval [0, 1]. A few terms of the
sequence are shown in Figure 11.2. The integral of $f_n$ over the interval [0, 1] is given by

\[ \int_0^1 f_n(x)\,dx = n\int_0^1 x(1 - x^2)^n\,dx = -\frac{n}{2}\,\frac{(1 - x^2)^{n+1}}{n + 1}\bigg|_0^1 = \frac{n}{2(n + 1)}. \]

Therefore we have $\lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \tfrac{1}{2}$, but $\int_0^1 \lim_{n\to\infty} f_n(x)\,dx = 0$. In other words,
the limit of the integrals is not equal to the integral of the limit. This example shows that
the two operations of “limit” and “integration” cannot always be interchanged. (See also
Exercises 17 and 18 in Section 11.7.)
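As a rough numerical check of Example 2, the following Python sketch compares a midpoint-rule estimate of the integral of $f_n(x) = nx(1 - x^2)^n$ over [0, 1] with the closed form $n/(2(n+1))$, and also evaluates $f_n$ at a fixed point to show the pointwise limit 0. The function names and the choice of the midpoint rule are assumptions made for illustration; they are not part of the text.

```python
def f(n, x):
    """f_n(x) = n x (1 - x^2)^n, the sequence of Example 2."""
    return n * x * (1.0 - x * x) ** n


def midpoint_integral(n, cells=20000):
    """Midpoint-rule estimate of the integral of f_n over [0, 1]."""
    h = 1.0 / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))


def exact_integral(n):
    """Closed form n / (2(n + 1)) obtained in the example."""
    return n / (2.0 * (n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # f_n(0.5) shrinks toward 0 while the integral approaches 1/2.
        print(n, f(n, 0.5), midpoint_integral(n), exact_integral(n))
```

The values at the fixed point x = 0.5 shrink rapidly while the integrals stay near 1/2, which is the failure of interchange described above.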

George G. Stokes (1819-1903), Phillip L. v. Seidel (1821-1896), and Karl Weierstrass 
were the first to realize that some extra condition is needed to justify interchanging these 
operations. In 1848, Stokes and Seidel (independently and almost simultaneously) intro- 
duced a concept now known as uniform convergence and showed that for a uniformly 
convergent sequence the operations of limit and integration could be interchanged. 
Weierstrass later showed that the concept is of great importance in advanced analysis. We
shall introduce the concept in the next section and show its relation to continuity and to 
integration. 

11.2 Uniform convergence of sequences of functions 

Let $\{f_n\}$ be a sequence which converges pointwise on a set $S$ to a limit function $f$. By the
definition of limit, this means that for each $x$ in $S$ and for each $\epsilon > 0$ there is an integer $N$,
which depends on both $x$ and $\epsilon$, such that $|f_n(x) - f(x)| < \epsilon$ whenever $n \ge N$. If the same
$N$ serves equally well for all points $x$ in $S$, then the convergence is said to be uniform on $S$.
That is, we have the following.

DEFINITION. A sequence of functions $\{f_n\}$ is said to converge uniformly to $f$ on a set $S$
if for every $\epsilon > 0$ there is an $N$ (depending only on $\epsilon$) such that $n \ge N$ implies

\[ |f_n(x) - f(x)| < \epsilon \quad \text{for all } x \text{ in } S. \]

We denote this symbolically by writing

\[ f_n \to f \quad \text{uniformly on } S. \]



Figure 11.3 Geometric meaning of uniform convergence. If $n \ge N$, the entire graph
of each $f_n$ lies within a distance $\epsilon$ from the graph of the limit function $f$.


When the functions $f_n$ are real-valued, there is a simple geometric interpretation of
uniform convergence. The inequality $|f_n(x) - f(x)| < \epsilon$ is equivalent to the pair of
inequalities

\[ f(x) - \epsilon < f_n(x) < f(x) + \epsilon. \]

If these hold for all $n \ge N$ and every $x$ in $S$, then the entire graph of $f_n$ above $S$ lies within
a band of height $2\epsilon$ situated symmetrically about the graph of $f$, as indicated in Figure 11.3.
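To make the band picture concrete, here is a small Python sketch for the sequence $f_n(x) = x^n$. It assumes the limit function is 0 on [0, 1). On a shorter interval [0, R] with R < 1 the largest deviation is R^n, which tends to 0, so the convergence is uniform there. On [0, 1) no single N works, because for every n there is a point where $f_n$ equals 1/2. The helper names are hypothetical.

```python
def f(n, x):
    """f_n(x) = x^n."""
    return x ** n


def sup_distance_on(R, n):
    """Largest value of |x^n - 0| for 0 <= x <= R, where 0 <= R < 1.

    x^n is increasing on [0, R], so the maximum is attained at x = R.
    """
    return R ** n


def witness_point(n):
    """A point x_n in (0, 1) with f_n(x_n) = 1/2, namely x_n = (1/2)^(1/n)."""
    return 0.5 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        x_n = witness_point(n)
        # Deviation on [0, 0.9] shrinks; deviation at x_n stays at 1/2.
        print(n, sup_distance_on(0.9, n), x_n, f(n, x_n))
```

Since the graph of each $f_n$ leaves the band of height $2\epsilon$ around 0 whenever $\epsilon < 1/2$, the convergence on [0, 1) is pointwise but not uniform.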


11.3 Uniform convergence and continuity 

Now we prove that uniform convergence transmits continuity from the individual terms
of the sequence $\{f_n\}$ to the limit function $f$.

THEOREM 11.1. Assume $f_n \to f$ uniformly on an interval $S$. If each function $f_n$ is
continuous at a point $p$ in $S$, then the limit function $f$ is also continuous at $p$.

Proof. We will show that for every $\epsilon > 0$ there is a neighborhood $N(p)$ such that
$|f(x) - f(p)| < \epsilon$ whenever $x \in N(p) \cap S$. If $\epsilon > 0$ is given, there is an integer $N$ such
that $n \ge N$ implies

\[ |f_n(x) - f(x)| < \frac{\epsilon}{3} \quad \text{for all } x \text{ in } S. \]

Since $f_N$ is continuous at $p$, there is a neighborhood $N(p)$ such that

\[ |f_N(x) - f_N(p)| < \frac{\epsilon}{3} \quad \text{for all } x \text{ in } N(p) \cap S. \]

Therefore, for all $x$ in $N(p) \cap S$, we have

\[ |f(x) - f(p)| = |f(x) - f_N(x) + f_N(x) - f_N(p) + f_N(p) - f(p)| \]
\[ \le |f(x) - f_N(x)| + |f_N(x) - f_N(p)| + |f_N(p) - f(p)|. \]

Since each term on the right is $< \epsilon/3$, we find $|f(x) - f(p)| < \epsilon$, which completes the proof.

The foregoing theorem has an important application to infinite series of functions. If
the function values $f_n(x)$ are partial sums of other functions, say

\[ f_n(x) = \sum_{k=1}^{n} u_k(x), \]

and if $f_n \to f$ pointwise on $S$, then we have

\[ f(x) = \lim_{n\to\infty} f_n(x) = \sum_{k=1}^{\infty} u_k(x) \]

for each $x$ in $S$. In this case, the series $\sum u_k$ is said to converge pointwise to the sum function
$f$. If $f_n \to f$ uniformly on $S$, we say the series $\sum u_k$ converges uniformly to $f$. If each term
$u_k$ is continuous at a point $p$ in $S$, then each partial sum $f_n$ is also continuous at $p$ so, from
Theorem 11.1, we obtain the following corollary.

THEOREM 11.2. If a series of functions $\sum u_k$ converges uniformly to a sum function $f$ on
a set $S$, and if each term $u_k$ is continuous at a point $p$ in $S$, then the sum $f$ is also continuous
at $p$.

Note: We can also express this result symbolically by writing

\[ \lim_{x\to p} \sum_{k=1}^{\infty} u_k(x) = \sum_{k=1}^{\infty} \lim_{x\to p} u_k(x). \]

We describe this by saying that for a uniformly convergent series we may interchange
the limit symbol with the summation symbol, or that we can pass to the limit term by
term.


11.4 Uniform convergence and integration 

The next theorem shows that uniform convergence allows us to interchange the integration 
symbol with the limit symbol. 


THEOREM 11.3. Assume $f_n \to f$ uniformly on an interval [a, b], and assume that each
function $f_n$ is continuous on [a, b]. Define a new sequence $\{g_n\}$ by the equation

\[ g_n(x) = \int_a^x f_n(t)\,dt \quad \text{if } x \in [a, b], \]

and let $g(x) = \int_a^x f(t)\,dt$. Then $g_n \to g$ uniformly on [a, b].

THEOREM 11.4. Assume that a series of functions $\sum u_k$ converges uniformly to a sum
function $f$ on an interval [a, b], where each $u_k$ is continuous on [a, b]. If $x \in [a, b]$, define

\[ g_n(x) = \sum_{k=1}^{n} \int_a^x u_k(t)\,dt \quad \text{and} \quad g(x) = \int_a^x f(t)\,dt. \]

Then $g_n \to g$ uniformly on [a, b]. In other words, we have

\[ \lim_{n\to\infty} \sum_{k=1}^{n} \int_a^x u_k(t)\,dt = \int_a^x \lim_{n\to\infty} \sum_{k=1}^{n} u_k(t)\,dt \]

or

\[ \sum_{k=1}^{\infty} \int_a^x u_k(t)\,dt = \int_a^x \sum_{k=1}^{\infty} u_k(t)\,dt. \]

Proof. Apply Theorem 11.3 to the sequence of partial sums $\{f_n\}$ given by

\[ f_n(t) = \sum_{k=1}^{n} u_k(t), \]

and note that $\int_a^x f_n(t)\,dt = \sum_{k=1}^{n} \int_a^x u_k(t)\,dt$.

Theorem 11.4 is often described by saying that a uniformly convergent series may be
integrated term by term. 


11.5 A sufficient condition for uniform convergence 

Weierstrass developed a useful test for showing that certain series are uniformly convergent.
The test is applicable whenever the given series can be dominated by a convergent
series of positive constants.

THEOREM 11.5. THE WEIERSTRASS M-TEST. Given a series of functions $\sum u_n$ which
converges pointwise to a function $f$ on a set $S$. If there is a convergent series of positive constants
$\sum M_n$ such that

\[ 0 \le |u_n(x)| \le M_n \quad \text{for every } n \ge 1 \text{ and every } x \text{ in } S, \]

then the series $\sum u_n$ converges uniformly on $S$.

Proof. The comparison test shows that the series $\sum u_n(x)$ converges absolutely for each
$x$ in $S$. For each $x$ in $S$, we have

\[ \left| f(x) - \sum_{k=1}^{n} u_k(x) \right| = \left| \sum_{k=n+1}^{\infty} u_k(x) \right| \le \sum_{k=n+1}^{\infty} |u_k(x)| \le \sum_{k=n+1}^{\infty} M_k. \]

Since the series $\sum M_k$ converges, for every $\epsilon > 0$ there is an integer $N$ such that $n \ge N$
implies

\[ \sum_{k=n+1}^{\infty} M_k < \epsilon. \]

This shows that

\[ \left| f(x) - \sum_{k=1}^{n} u_k(x) \right| < \epsilon \]

for all $n \ge N$ and every $x$ in $S$. Therefore, the series $\sum u_n$ converges uniformly to $f$ on $S$.

Term-by-term differentiation of an arbitrary series of functions is even less promising 
than term-by-term integration. For example, the series $\sum_{n=1}^{\infty} (\sin nx)/n^2$ converges for all
real $x$ because it is dominated by $\sum 1/n^2$. Moreover, the convergence is uniform on the
whole real axis. However, the series obtained by differentiating term by term is $\sum (\cos nx)/n$,
and this diverges when x = 0. This example shows that term-by-term differentiation may 
destroy convergence, even though the original series is uniformly convergent. Therefore, 
the problem of justifying the interchange of the operations of differentiation and summation 
is, in general, more serious than in the case of integration. We mention this example so the 
reader may realize that familiar manipulations with finite sums do not always carry over 
to infinite series, even if the series involved are uniformly convergent. We turn next to 
special series of functions, known as power series, which can be manipulated in many 
respects as though they were finite sums. 
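The following Python sketch illustrates both halves of this discussion under some stated assumptions. For the series with terms (sin kx)/k^2 it checks, on a finite grid of x values chosen here for illustration, that the gap between two partial sums never exceeds the tail bound 1/n for the dominating series with terms 1/k^2 (the bound 1/n comes from comparing the tail with an integral). It also computes partial sums of the differentiated series at x = 0, which are harmonic sums and keep growing. A grid check is only suggestive, not a proof of uniform convergence.

```python
import math


def partial_sum(x, n):
    """n-th partial sum of the series with terms sin(kx)/k^2."""
    return sum(math.sin(k * x) / (k * k) for k in range(1, n + 1))


def tail_bound(n):
    """Upper bound 1/n for the tail sum of 1/k^2 over k > n."""
    return 1.0 / n


def max_gap(n, m, points=60):
    """Largest |S_m(x) - S_n(x)| over an evenly spaced grid in [0, 2*pi]."""
    xs = [2.0 * math.pi * i / points for i in range(points + 1)]
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in xs)


def differentiated_partial_sum_at_zero(n):
    """n-th partial sum of the series with terms cos(k*0)/k = 1/k."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    print(max_gap(50, 2000), tail_bound(50))
    for n in (10, 1000, 100000):
        print(n, differentiated_partial_sum_at_zero(n))
```

The first line of output shows a gap below the M-test bound at every grid point, while the harmonic sums grow without any sign of settling.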
11.6 Power series. Circle of convergence

An infinite series of the form

\[ \sum_{n=0}^{\infty} a_n (z - a)^n = a_0 + a_1(z - a) + \cdots + a_n(z - a)^n + \cdots \]

is called a power series in $z - a$. The numbers $z$, $a$, and the coefficients $a_n$ are complex.
With each power series there is associated a circle, called the circle of convergence, such 
that the series converges absolutely for every z interior to this circle, and diverges for every 
z outside this circle. The center of the circle is at a and its radius r is called the radius of 



Figure 11.4 The circle of convergence of a power series.


convergence. (See Figure 11.4.) In extreme cases, the circle may shrink to the single point 
a, in which case r = 0, or it may consist of the entire complex plane, in which case we say 
that $r = +\infty$. The existence of the circle of convergence is shown in Theorem 11.7.

The behavior of the series at the boundary points of the circle cannot be predicted in 
advance. Examples show that there may be convergence at none, some, or all the boundary 
points. 

For many power series that occur in practice, the radius of convergence can be determined 
by using either the ratio test or the root test, as in the following examples. 

EXAMPLE 1. To find the radius of convergence of the power series $\sum z^n/n!$, we apply
the ratio test. If $z \ne 0$, the ratio of consecutive terms has absolute value

\[ \left| \frac{z^{n+1}}{(n + 1)!} \cdot \frac{n!}{z^n} \right| = \frac{|z|}{n + 1}. \]

Since this ratio tends to 0 as $n \to \infty$, we conclude that the series converges absolutely for
all complex $z \ne 0$. It also converges for $z = 0$, so the radius of convergence is $+\infty$.

Since the general term of a convergent series must tend to 0, the result of the foregoing
example proves that

\[ \lim_{n\to\infty} \frac{z^n}{n!} = 0 \]

for every complex $z$. That is, $n!$ “grows faster” than the nth power of any fixed complex
number $z$ as $n \to \infty$.

EXAMPLE 2. To test the series $\sum n^2 3^n z^n$, we use the root test. We have

\[ (n^2 3^n |z|^n)^{1/n} = 3|z|\,n^{2/n} \to 3|z| \quad \text{as } n \to \infty, \]

since $n^{2/n} = (n^{1/n})^2$ and $n^{1/n} \to 1$ as $n \to \infty$. Therefore, the series converges absolutely if
$|z| < \tfrac{1}{3}$ and diverges if $|z| > \tfrac{1}{3}$. The radius of convergence is $\tfrac{1}{3}$. This particular power
series diverges at every boundary point because, if $|z| = \tfrac{1}{3}$, the general term has absolute
value $n^2$.

EXAMPLE 3. For each of the series $\sum z^n/n$ and $\sum z^n/n^2$, the ratio test tells us that the
radius of convergence is 1. The first series diverges at the boundary point $z = 1$ but
converges at all other boundary points (see Section 10.19). The second series converges
at every boundary point since it is dominated by $\sum 1/n^2$.
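The root-test computation can be mimicked numerically. The Python sketch below evaluates $|a_n|^{1/n}$ for one large n, working with logarithms to avoid overflow, and takes the reciprocal as a heuristic estimate of the radius of convergence. This is an illustration only: a single finite n does not prove anything about the limit, and the root test is used here for both series even though the text applies the ratio test to the series with coefficients 1/n!. The helper names are hypothetical.

```python
import math


def root_test_value(log_abs_coeff, n):
    """|a_n|^(1/n), computed as exp(log|a_n| / n) to avoid overflow."""
    return math.exp(log_abs_coeff(n) / n)


def log_coeff_n2_3n(n):
    """log|a_n| for a_n = n^2 * 3^n."""
    return 2.0 * math.log(n) + n * math.log(3.0)


def log_coeff_inverse_factorial(n):
    """log|a_n| for a_n = 1/n!, using lgamma(n + 1) = log(n!)."""
    return -math.lgamma(n + 1)


def radius_estimate(log_abs_coeff, n):
    """Heuristic radius 1 / |a_n|^(1/n) at a single index n."""
    return 1.0 / root_test_value(log_abs_coeff, n)


if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        # First column approaches 1/3; second column grows without bound.
        print(n, radius_estimate(log_coeff_n2_3n, n),
              radius_estimate(log_coeff_inverse_factorial, n))
```

For the first series the estimates approach 1/3, and for the second they grow without bound, in line with a radius of convergence of $+\infty$.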

We conclude this section with a proof that every power series has a circle of convergence. 
The proof is based on the following theorem. 

THEOREM 11.6. Assume the power series $\sum a_n z^n$ converges for a particular $z \ne 0$, say
for $z = z_1$. Then we have:

(a) The series converges absolutely for every $z$ with $|z| < |z_1|$.

(b) The series converges uniformly on every circular disk with center at 0 and radius
$R < |z_1|$.

Proof. Since $\sum a_n z_1^n$ converges, its general term tends to 0 as $n \to \infty$. In particular,
$|a_n z_1^n| < 1$ from some point on, say for $n \ge N$. Let $S$ be a circular disk of radius $R$, where
$0 < R < |z_1|$. If $z \in S$ and $n \ge N$, we have $|z| \le R$ and

\[ |a_n z^n| = |a_n z_1^n| \left| \frac{z}{z_1} \right|^n < \left| \frac{z}{z_1} \right|^n \le \left| \frac{R}{z_1} \right|^n = t^n, \quad \text{where } t = \left| \frac{R}{z_1} \right|. \]

Since $0 < t < 1$, the series $\sum a_n z^n$ is dominated by the convergent geometric series $\sum t^n$.
By Weierstrass’ M-test, the series $\sum a_n z^n$ converges uniformly on $S$. This proves (b). The
argument also shows that the series $\sum a_n z^n$ converges absolutely for each $z$ in $S$. But since
each $z$ with $|z| < |z_1|$ lies in some circular disk $S$ with radius $R < |z_1|$, this also proves
part (a).


THEOREM 11.7. EXISTENCE OF A CIRCLE OF CONVERGENCE. Assume that the power series
$\sum a_n z^n$ converges for at least one $z \ne 0$, say for $z = z_1$, and that it diverges for at least one
$z$, say for $z = z_2$. Then there exists a positive real number $r$ such that the series converges
absolutely if $|z| < r$ and diverges if $|z| > r$.

Proof. Let $A$ denote the set of all positive numbers $|z|$ for which the power series
$\sum a_n z^n$ converges. The set $A$ is not empty since, by hypothesis, it contains $|z_1|$. Also, no
number in $A$ can exceed $|z_2|$ (because of Theorem 11.6). Hence, $|z_2|$ is an upper bound
for $A$. Since $A$ is a nonempty set of positive numbers that is bounded above, it has a least
upper bound which we denote by $r$. It is clear that $r > 0$ since $r \ge |z_1|$. By the definition of
$r$, no number in $A$ can exceed $r$. Therefore, the series diverges if $|z| > r$. But it is easy
to prove that the series converges absolutely if $|z| < r$. If $|z| < r$, there is a positive number
$x$ in $A$ such that $|z| < x < r$. By Theorem 11.6, the series $\sum a_n z^n$ converges absolutely.
This completes the proof.

There is, of course, a corresponding theorem for power series in $z - a$ which may be
deduced from the case just treated by introducing the change of variable Z = z — a. The 
circle of convergence has its center at a, as shown in Figure 11.4. 


11.7 Exercises 

In Exercises 1 through 16, determine the radius of convergence r of the given power series. In 
Exercises 1 through 10, test for convergence at the boundary points if r is finite. 
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a > 0, b > 0. 


16. 


n = 1 


a > 0, b > 0. 


17. If $f_n(x) = nxe^{-nx^2}$ for $n = 1, 2, \ldots$ and $x$ real, show that

\[ \lim_{n\to\infty} \int_0^1 f_n(x)\,dx \ne \int_0^1 \lim_{n\to\infty} f_n(x)\,dx. \]

This example shows that the operations of integration and limit cannot always be interchanged.

18. Let $f_n(x) = (\sin nx)/n$, and for each fixed real $x$ let $f(x) = \lim_{n\to\infty} f_n(x)$. Show that

\[ \lim_{n\to\infty} f_n'(0) \ne f'(0). \]

This example shows that the operations of differentiation and limit cannot always be interchanged.

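19. Show that the series $\sum_{n=1}^{\infty} (\sin nx)/n^2$ converges for every real $x$, and denote its sum by $f(x)$.
Prove that $f$ is continuous on $[0, \pi]$, and use Theorem 11.4 to prove that

\[ \int_0^{\pi} f(x)\,dx = 2 \sum_{n=1}^{\infty} \frac{1}{(2n - 1)^3}. \]

20. It is known that

\[ \sum_{n=1}^{\infty} \frac{\cos nx}{n^2} = \frac{x^2}{4} - \frac{\pi x}{2} + \frac{\pi^2}{6} \quad \text{if } 0 \le x \le 2\pi. \]

Use this formula and Theorem 11.4 to deduce the following formulas:

(a) $\sum_{n=1}^{\infty} \dfrac{1}{n^2} = \dfrac{\pi^2}{6}$.

(b) $\sum_{n=1}^{\infty} \dfrac{(-1)^{n+1}}{(2n - 1)^3} = \dfrac{\pi^3}{32}$.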


11.8 Properties of functions represented by real power series 

In this section we restrict ourselves to real power series, that is, series of the form
$\sum a_n (z - a)^n$ in which $z$, $a$, and the coefficients $a_n$ are all real numbers. We also write $x$ in
place of $z$. The interior of the circle of convergence intersects the real axis along an interval
$(a - r, a + r)$ symmetrically located about $a$; we refer to this as the interval of convergence
of the real power series $\sum a_n (x - a)^n$. The number $r$ is called the radius of convergence.
(See Figure 11.5.)


Figure 11.5 The interval of convergence of a real power series: absolute convergence
inside $(a - r, a + r)$, divergence outside.


Each real power series defines a sum function whose value at each $x$ in the interval of
convergence is given by

\[ f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n. \]

The series is said to represent the function $f$ in the interval of convergence, and it is called
the power-series expansion of $f$ about $a$.

There are two basic problems about power-series expansions that concern us here: 

(1) Given the series, to find properties of the sum function f. 

(2) Given a function $f$, to find whether or not it may be represented by a power series.
It turns out that only rather special functions possess power-series expansions. Nevertheless, 
the class of such functions includes most examples that arise in practice, and hence their 
study is of great importance. We turn now to a discussion of question (1).

Theorem 11.6 tells us that the power series converges absolutely for each $x$ in the open
interval $(a - r, a + r)$, and that it converges uniformly on every closed subinterval
$[a - R, a + R]$, where $0 < R < r$. Since each term of the power series is continuous on
the whole real axis, it follows from Theorem 11.2 that the sum function $f$ is continuous
on every closed subinterval $[a - R, a + R]$, and hence on the open interval $(a - r, a + r)$.
Also, Theorem 11.4 tells us that we can integrate the power series term by term on every
closed subinterval $[a - R, a + R]$. These properties of functions represented by power
series are stated formally in the following theorem. 

THEOREM 11.8. Assume a function $f$ is represented by the power series

(11.1)    \[ f(x) = \sum_{n=0}^{\infty} a_n (x - a)^n \]

in an open interval $(a - r, a + r)$. Then $f$ is continuous on this interval, and its integral over
any closed subinterval may be computed by integrating the series term by term. In particular,
for every $x$ in $(a - r, a + r)$, we have

\[ \int_a^x f(t)\,dt = \sum_{n=0}^{\infty} a_n \int_a^x (t - a)^n\,dt = \sum_{n=0}^{\infty} \frac{a_n}{n + 1} (x - a)^{n+1}. \]

Theorem 11.8 also shows that the radius of convergence of the integrated series is at 
least as large as that of the original series. We will prove presently that both series have 
exactly the same radius of convergence. First we show that a power series may be 
differentiated term by term within its interval of convergence. 

THEOREM 11.9. Let $f$ be represented by the power series (11.1) in the interval of
convergence $(a - r, a + r)$. Then we have:

(a) The differentiated series $\sum_{n=1}^{\infty} n a_n (x - a)^{n-1}$ also has radius of convergence $r$.

(b) The derivative $f'(x)$ exists for each $x$ in the interval of convergence and is given by

\[ f'(x) = \sum_{n=1}^{\infty} n a_n (x - a)^{n-1}. \]

Proof. For simplicity, in the proof we assume that $a = 0$. First we prove that the
differentiated series converges absolutely in the interval $(-r, r)$. Choose any positive $x$
such that $0 < x < r$, and let $h$ be a small positive number such that $0 < x < x + h < r$.
Then the series for $f(x)$ and for $f(x + h)$ are each absolutely convergent. Hence, we may
write

(11.2)    \[ \frac{f(x + h) - f(x)}{h} = \sum_{n=0}^{\infty} a_n \frac{(x + h)^n - x^n}{h}. \]

The series on the right is absolutely convergent since it is a linear combination of absolutely
convergent series. Now we apply the mean-value theorem to write

\[ (x + h)^n - x^n = h n c_n^{n-1}, \]

where $x < c_n < x + h$. Hence, the series in (11.2) is identical to the series

(11.3)    \[ \sum_{n=1}^{\infty} n a_n c_n^{n-1} \]

which must be absolutely convergent, since that in Equation (11.2) is. The series (11.3)
is no longer a power series, but it dominates the power series $\sum n a_n x^{n-1}$, so this latter series
must be absolutely convergent for this $x$. This proves that the radius of convergence of
the differentiated series $\sum n a_n x^{n-1}$ is at least as large as $r$. On the other hand, the radius of
convergence of the differentiated series cannot exceed $r$ because the differentiated series
dominates the original series $\sum a_n x^n$. This proves part (a).

To prove part (b), let $g$ be the sum function of the differentiated series,

\[ g(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}. \]

Applying Theorem 11.8 to $g$, we may integrate term by term in the interval of convergence
to obtain

\[ \int_0^x g(t)\,dt = \sum_{n=1}^{\infty} a_n x^n = f(x) - a_0. \]

Since $g$ is continuous, the first fundamental theorem of calculus tells us that $f'(x)$ exists
and equals $g(x)$ for each $x$ in the interval of convergence. This proves (b).

Note: Since every power series $\sum a_n (x - a)^n$ can be obtained by differentiating its
integrated series, $\sum a_n (x - a)^{n+1}/(n + 1)$, Theorem 11.9 tells us that both these series
have the same radius of convergence.

Theorems 11.8 and 11.9 justify the formal manipulations of Section 10.8 where we
obtained various power-series expansions using term-by-term differentiation and integration
of the geometric series. In particular, these theorems establish the validity of the expansions

\[ \log(1 + x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{n+1}}{n + 1} \quad \text{and} \quad \arctan x = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n + 1} \]

whenever $x$ is in the open interval $-1 < x < 1$.
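As a quick sanity check on these two expansions, the Python sketch below sums finitely many terms of each series and compares the result with the standard-library functions `math.log1p` and `math.atan`. The number of terms and the sample point are arbitrary choices made for illustration.

```python
import math


def log1p_series(x, terms):
    """Partial sum of sum_{n>=0} (-1)^n x^(n+1) / (n+1), for |x| < 1."""
    return sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(terms))


def arctan_series(x, terms):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n+1) / (2n+1), for |x| < 1."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


if __name__ == "__main__":
    x = 0.5
    for terms in (5, 20, 60):
        print(terms,
              log1p_series(x, terms) - math.log1p(x),
              arctan_series(x, terms) - math.atan(x))
```

Inside the interval the differences shrink geometrically as more terms are added; close to the endpoints many more terms are needed for the same accuracy.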

As a further consequence of Theorem 11.9, we conclude that the sum function of a power
series has derivatives of every order and they may be obtained by repeated term-by-term
differentiation of the power series. If $f(x) = \sum a_n (x - a)^n$ and if we differentiate this
formula $k$ times and then put $x = a$ in the result, we find that $f^{(k)}(a) = k!\,a_k$, so the kth
coefficient $a_k$ is given by the formula

\[ a_k = \frac{f^{(k)}(a)}{k!} \quad \text{for } k = 1, 2, 3, \ldots. \]

This formula also holds for $k = 0$ if we interpret $f^{(0)}(a)$ to mean $f(a)$. Thus, the power-series
expansion of $f$ has the form

(11.4)    \[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k. \]

This property can be formulated as a uniqueness theorem for power-series expansions.

THEOREM 11.10. If two power series $\sum a_n (x - a)^n$ and $\sum b_n (x - a)^n$ have the same sum
function $f$ in some neighborhood of the point $a$, then the two series are equal term by term; in
fact, we have $a_n = b_n = f^{(n)}(a)/n!$ for each $n \ge 0$.

Equation (11.4) also shows that the partial sums of a power series are simply the Taylor
polynomials of the sum function at $a$. In other words, if a function $f$ is representable by a
power series in an interval $(a - r, a + r)$, then the sequence of Taylor polynomials
$\{T_n f(x; a)\}$ generated by $f$ at $a$ converges pointwise in this interval to the sum function $f$.
Moreover, the convergence is uniform in every closed subinterval of the interval of
convergence.
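The coefficient formula $a_k = f^{(k)}(a)/k!$ can be tried out on a finite example. The Python sketch below treats a polynomial in $t = x - a$ as a power series with finitely many nonzero coefficients, differentiates it term by term k times, evaluates at $t = 0$, and divides by k!. Restricting to polynomials is an assumption made so the computation is exact; the helper names are hypothetical.

```python
import math


def differentiate(coeffs):
    """Coefficients of the term-by-term derivative of sum coeffs[n] * t^n."""
    return [n * c for n, c in enumerate(coeffs)][1:]


def kth_derivative_at_center(coeffs, k):
    """Value of the k-th derivative at t = 0, that is, at x = a."""
    for _ in range(k):
        coeffs = differentiate(coeffs)
    # After k differentiations the constant term is k! times the old a_k.
    return coeffs[0] if coeffs else 0


def recovered_coefficient(coeffs, k):
    """a_k recovered as f^(k)(a) / k!."""
    return kth_derivative_at_center(coeffs, k) / math.factorial(k)


if __name__ == "__main__":
    a = [3, -1, 4, 1, -5, 9]
    print([recovered_coefficient(a, k) for k in range(len(a))])
```

Each recovered value agrees with the coefficient that was put in, which is the content of the uniqueness statement in the polynomial case.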


11.9 The Taylor’s series generated by a function 

We turn now to the second problem raised at the beginning of the foregoing section.
That is, given a function $f$, to find whether or not it has a power-series expansion in some
open interval about a point $a$.

We know from what was just proved that such a function must necessarily have derivatives
of every order in some open interval about $a$ and that the coefficients of its power-series
expansion are given by Equation (11.4). Suppose, then, that we start with a function $f$
having derivatives of every order in an open interval about $a$. We call such a function
infinitely differentiable in this interval. Then we can certainly form the power series

(11.5)    \[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!} (x - a)^k. \]

This is called the Taylor’s series generated by $f$ at $a$. We now ask two questions: Does this
series converge for any $x$ other than $x = a$? If so, is its sum equal to $f(x)$? Surprisingly
enough, the answer to both questions is, in general, “no.” The series may or may not
converge for $x \ne a$ and, if it does converge, its sum may or may not be $f(x)$. An example
where the series converges to a sum different from $f(x)$ is given in Exercise 24 in Section
11.13.

A necessary and sufficient condition for answering both questions in the affirmative can
be given by using Taylor’s formula with remainder, which provides a finite expansion of
the form

(11.6)    \[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k + E_n(x). \]

The finite sum is the Taylor polynomial of degree $n$ generated by $f$ at $a$, and $E_n(x)$ is the
error made in approximating $f$ by its Taylor polynomial. If we let $n \to \infty$ in (11.6), we
see that the power series (11.5) will converge to $f(x)$ if and only if the error term tends to 0.
In the next section we discuss a useful sufficient condition for the error term to tend to 0.


11.10 A sufficient condition for convergence of a Taylor’s series 


In Theorem 7.6 we proved that the error term in Taylor’s formula could be expressed
as an integral,

(11.7)    \[ E_n(x) = \frac{1}{n!} \int_a^x (x - t)^n f^{(n+1)}(t)\,dt, \]

in any interval about $a$ in which $f^{(n+1)}$ is continuous. Therefore, if $f$ is infinitely differentiable,
we always have this representation of the error so the Taylor’s series converges to $f(x)$ if
and only if this integral tends to 0 as $n \to \infty$.

The integral can be put into a slightly more useful form by a change of variable. We
write

\[ t = x + (a - x)u, \qquad dt = -(x - a)\,du, \]

and note that $u$ varies from 1 to 0 as $t$ varies from $a$ to $x$. Therefore, the integral in (11.7)
becomes

(11.8)    \[ E_n(x) = \frac{(x - a)^{n+1}}{n!} \int_0^1 u^n f^{(n+1)}[x + (a - x)u]\,du. \]

This form of the error enables us to give the following sufficient condition for convergence
of a Taylor’s series.

THEOREM 11.11. Assume $f$ is infinitely differentiable in an open interval $I = (a - r, a + r)$,
and assume that there is a positive constant $A$ such that

(11.9)    \[ |f^{(n)}(x)| \le A^n \quad \text{for } n = 1, 2, 3, \ldots, \]

and every $x$ in $I$. Then the Taylor’s series generated by $f$ at $a$ converges to $f(x)$ for each $x$ in $I$.

Proof. Using the inequality (11.9) in the integral formula (11.8), we obtain the estimate

\[ 0 \le |E_n(x)| \le \frac{|x - a|^{n+1}}{n!} A^{n+1} \int_0^1 u^n\,du = \frac{|x - a|^{n+1} A^{n+1}}{(n + 1)!} = \frac{B^{n+1}}{(n + 1)!}, \]

where $B = A|x - a|$. But for every $B$, $B^n/n!$ tends to 0 as $n \to \infty$, so $E_n(x) \to 0$ for each
$x$ in $I$.


11.11 Power-series expansions for the exponential and trigonometric functions 

The sine and cosine functions and all their derivatives are bounded by 1 over the entire
real axis. Therefore, inequality (11.9) holds with $A = 1$ if $f(x) = \sin x$ or if $f(x) = \cos x$
and we have the power-series expansions

\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^{n-1} \frac{x^{2n-1}}{(2n - 1)!} + \cdots, \]

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots, \]

valid for every real $x$. For the exponential function, $f(x) = e^x$, we have $f^{(n)}(x) = e^x$ for all
$x$, so in any finite interval $(-r, r)$ we have $e^x < e^r$. Therefore, (11.9) is satisfied with
$A = e^r$. Since $r$ is arbitrary, this shows that the following power-series expansion is valid
for all real $x$:

\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots. \]
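The error estimate from Theorem 11.11 can be checked numerically for the sine function, where the constant in (11.9) is A = 1 and the center is a = 0, so the bound reads $|E_n(x)| \le |x|^{n+1}/(n+1)!$. The Python sketch below compares the actual error of the Taylor polynomial with this bound at a few sample points. The function names and sample values are assumptions for illustration, and floating-point rounding means the comparison is only approximate for large |x|.

```python
import math


def sin_taylor(x, n):
    """Taylor polynomial of sin at 0 containing all terms of degree <= n."""
    return sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range((n + 1) // 2))


def error_bound(x, n, A=1.0):
    """Bound B^(n+1) / (n+1)! with B = A * |x|, taking a = 0."""
    return (A * abs(x)) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    for x in (0.5, 3.0, 10.0):
        for n in (5, 15, 40):
            actual = abs(math.sin(x) - sin_taylor(x, n))
            print(x, n, actual, error_bound(x, n))
```

For a fixed x the bound eventually decreases to 0 however large x is, which reflects the fact that n! outgrows the nth power of any fixed number.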


The foregoing power-series expansions for the sine and cosine can be used as the starting
point for a completely analytic treatment of the trigonometric functions. If we use these
series as definitions of the sine and cosine, it is possible to derive all the familiar algebraic
and analytic properties of the trigonometric functions from these series alone. For example,
the series immediately give us the formulas

\[ \sin 0 = 0, \quad \cos 0 = 1, \quad \sin(-x) = -\sin x, \quad \cos(-x) = \cos x, \]

\[ D \sin x = \cos x, \quad D \cos x = -\sin x. \]

The addition formulas may be derived by the following simple device: Let $u$ and $v$ be
new functions defined by the equations

\[ u(x) = \sin(x + a) - \sin x \cos a - \cos x \sin a, \]

\[ v(x) = \cos(x + a) - \cos x \cos a + \sin x \sin a, \]

where $a$ is a fixed real number, and let $f(x) = [u(x)]^2 + [v(x)]^2$. Then it is easy to verify
that $u'(x) = v(x)$ and $v'(x) = -u(x)$, and so $f'(x) = 0$ for all $x$. Therefore, $f$ is a constant
and, since $f(0) = 0$, we must have $f(x) = 0$ for all $x$. This implies $u(x) = v(x) = 0$ for
all $x$ or, in other words,

\[ \sin(x + a) = \sin x \cos a + \cos x \sin a, \]

\[ \cos(x + a) = \cos x \cos a - \sin x \sin a. \]

The number $\pi$ may be introduced as the smallest positive $x$ such that $\sin x = 0$ (such an
$x$ can be shown to exist) and then it can be shown that the sine and cosine are periodic
with period $2\pi$, that $\sin(\tfrac{1}{2}\pi) = 1$, and that $\cos(\tfrac{1}{2}\pi) = 0$. The details, which we shall not
present here, may be found in the book Theory and Application of Infinite Series by
K. Knopp (New York: Hafner, 1951).






*11.12 Bernstein’s theorem 

Theorem 11.11 shows that the Taylor’s series of a function $f$ converges if the nth derivative
$f^{(n)}$ grows no faster than the nth power of some positive number. Another sufficient
condition for convergence was formulated by the Russian mathematician Sergei N.
Bernstein (1880- ).

THEOREM 11.12. BERNSTEIN’S THEOREM. Assume $f$ and all its derivatives are nonnegative
on a closed interval [0, r]. That is, assume that

\[ f(x) \ge 0 \quad \text{and} \quad f^{(n)}(x) \ge 0 \]

for each $x$ in [0, r] and each $n = 1, 2, 3, \ldots$. Then, if $0 \le x < r$, the Taylor’s series

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k \]

converges to $f(x)$.

Proof. The result holds trivially for $x = 0$, so we assume that $0 < x < r$. We use
Taylor’s formula with remainder to write

(11.10)    \[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(0)}{k!} x^k + E_n(x). \]

We will prove that the error term satisfies the inequalities

(11.11)    \[ 0 \le E_n(x) \le \left( \frac{x}{r} \right)^{n+1} f(r). \]

This, in turn, shows that $E_n(x) \to 0$ as $n \to \infty$ since the quotient $(x/r)^{n+1} \to 0$ when
$0 < x < r$.

To prove (11.11), we use the integral form of the error as given in Equation (11.8) with
$a = 0$:

\[ E_n(x) = \frac{x^{n+1}}{n!} \int_0^1 u^n f^{(n+1)}(x - xu)\,du. \]

This formula is valid for each $x$ in the closed interval [0, r]. If $x \ne 0$, let

\[ F_n(x) = \frac{E_n(x)}{x^{n+1}} = \frac{1}{n!} \int_0^1 u^n f^{(n+1)}(x - xu)\,du. \]

The function $f^{(n+1)}$ is monotonic increasing in the interval [0, r] since its derivative is
nonnegative. Therefore, we have

\[ f^{(n+1)}(x - xu) = f^{(n+1)}[x(1 - u)] \le f^{(n+1)}[r(1 - u)] \]

if $0 \le u \le 1$, which implies that $F_n(x) \le F_n(r)$ if $0 < x \le r$. In other words, we have
$E_n(x)/x^{n+1} \le E_n(r)/r^{n+1}$ or

(11.12)    \[ E_n(x) \le \left( \frac{x}{r} \right)^{n+1} E_n(r). \]

Setting $x = r$ in Equation (11.10), we see that $E_n(r) \le f(r)$ because each term in the sum
is nonnegative. Using this in (11.12), we obtain (11.11) which, in turn, completes the proof.
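Bernstein's inequality (11.11) can be tried on the exponential function, which is nonnegative together with all its derivatives on any interval [0, r]. The Python sketch below computes the error $E_n(x) = e^x - \sum_{k \le n} x^k/k!$ and compares it with $(x/r)^{n+1} e^r$. The choice of $f = \exp$ and of the sample values is an assumption made for illustration; a small tolerance would be needed in any strict comparison because of floating-point rounding.

```python
import math


def exp_taylor(x, n):
    """Taylor polynomial of degree n generated by exp at 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def error_term(x, n):
    """E_n(x) = exp(x) minus its Taylor polynomial of degree n."""
    return math.exp(x) - exp_taylor(x, n)


def bernstein_bound(x, r, n):
    """Right-hand side of (11.11) for f = exp: (x/r)^(n+1) * exp(r)."""
    return (x / r) ** (n + 1) * math.exp(r)


if __name__ == "__main__":
    r = 2.0
    for x in (0.5, 1.0, 1.9):
        for n in (0, 3, 10, 30):
            print(x, n, error_term(x, n), bernstein_bound(x, r, n))
```

The bound is far from sharp for small x, but it decays geometrically in n for every x below r, which is all the proof requires.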


11.13 Exercises 


For each of the power series in Exercises 1 through 10 determine the set of all real x for which 
the series converges and compute the sum of the series. The power-series expansions given earlier 

in the text may be used whenever it is convenient to do so. 


1. 
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oo 
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oo 
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( — l) n lx\ 2n 


4. > (-1 fnx n . 
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oc 
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+ 1 


2 n + 1 \2 

i-l) n x 3n 

“ • 

x n 

](nTiy.' 

■ (* - l)” 

! in +2)!' 


Each of the functions in Exercises 11 through 21 has a power-series representation in powers of x. 
Assume the existence of the expansion, verify that the coefficients have the form given, and show 
that the series converges for the values of x indicated. The expansions given earlier in the text may 
be used whenever it is convenient to do so. 


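11. $a^x = \sum_{n=0}^{\infty} \dfrac{(\log a)^n}{n!} x^n$, $a > 0$ (all $x$). [Hint: $a^x = e^{x \log a}$.]

12. $\sinh x = \sum_{n=0}^{\infty} \dfrac{x^{2n+1}}{(2n + 1)!}$ (all $x$).

13. $\sin^2 x = \sum_{n=1}^{\infty} (-1)^{n+1} \dfrac{2^{2n-1}}{(2n)!} x^{2n}$ (all $x$). [Hint: $\cos 2x = 1 - 2\sin^2 x$.]

14. $\dfrac{1}{2 - x} = \sum_{n=0}^{\infty} \dfrac{x^n}{2^{n+1}}$ ($|x| < 2$).

15. $e^{-x^2} = \sum_{n=0}^{\infty} \dfrac{(-1)^n x^{2n}}{n!}$ (all $x$).

16. $\sin^3 x = \dfrac{3}{4} \sum_{n=1}^{\infty} (-1)^{n+1} \dfrac{3^{2n} - 1}{(2n + 1)!} x^{2n+1}$ (all $x$).

17. $\log \sqrt{\dfrac{1 + x}{1 - x}} = \sum_{n=0}^{\infty} \dfrac{x^{2n+1}}{2n + 1}$ ($|x| < 1$).

18. $\dfrac{3x}{1 + x - 2x^2} = \sum_{n=1}^{\infty} [1 - (-2)^n] x^n$ ($|x| < \tfrac{1}{2}$).
[Hint: $3x/(1 + x - 2x^2) = 1/(1 - x) - 1/(1 + 2x)$.]

19. $\dfrac{12 - 5x}{6 - 5x - x^2} = \sum_{n=0}^{\infty} \left( 1 + \dfrac{(-1)^n}{6^n} \right) x^n$ ($|x| < 1$).

20. $\dfrac{1}{x^2 + x + 1} = \dfrac{2}{\sqrt{3}} \sum_{n=0}^{\infty} \sin \dfrac{2\pi(n + 1)}{3}\, x^n$ ($|x| < 1$).
[Hint: $x^3 - 1 = (x - 1)(x^2 + x + 1)$.]

21. $\dfrac{x}{(1 - x)(1 - x^2)} = \dfrac{1}{2} \sum_{n=1}^{\infty} \left( n + \dfrac{1 - (-1)^n}{2} \right) x^n$ ($|x| < 1$).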

22. Determine the coefficient $a_{98}$ in the power-series expansion $\sin(2x + \tfrac{1}{4}\pi) = \sum_{n=0}^{\infty} a_n x^n$.

23. Let $f(x) = (2 + x^2)^{5/2}$. Determine the coefficients $a_0, a_1, \ldots, a_4$ in the Taylor's series
generated by f at 0.

24. Let $f(x) = e^{-1/x^2}$ if x ≠ 0, and let f(0) = 0.

(a) Show that f has derivatives of every order everywhere on the real axis.

(b) Show that $f^{(n)}(0) = 0$ for all n ≥ 1. This example shows that the Taylor's series generated
by f about the point 0 converges everywhere on the real axis, but that it represents f only at
the origin.




11.14 Power series and differential equations 

Power series sometimes enable us to obtain solutions of differential equations when 
other methods fail. A systematic discussion of the use of power series in the theory of 
linear second-order differential equations is given in Volume II. Here we illustrate with 
an example some of the ideas and techniques involved. 

Consider the second-order differential equation 

(11.13) $$(1 - x^2) y'' = -2y .$$

Assume there exists a solution, say y = f(x), which may be represented by a power-series
expansion in some neighborhood of the origin, say

(11.14) $$y = \sum_{n=0}^{\infty} a_n x^n .$$

The first thing we do is determine the coefficients $a_0, a_1, a_2, \ldots$.

One way to proceed is this: Differentiating (11.14) twice, we obtain

$$y'' = \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2} .$$

Multiplying by $1 - x^2$, we find that

(11.15)
$$\begin{aligned}
(1 - x^2) y'' &= \sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2} - \sum_{n=2}^{\infty} n(n - 1) a_n x^n \\
&= \sum_{n=0}^{\infty} (n + 2)(n + 1) a_{n+2} x^n - \sum_{n=0}^{\infty} n(n - 1) a_n x^n \\
&= \sum_{n=0}^{\infty} [(n + 2)(n + 1) a_{n+2} - n(n - 1) a_n] x^n .
\end{aligned}$$

Substituting each of the series (11.14) and (11.15) in the differential equation, we obtain
an equation involving two power series, valid in some neighborhood of the origin. By
the uniqueness theorem, these power series must be equal term by term. Therefore we
may equate coefficients of $x^n$ and obtain the relation

$$(n + 2)(n + 1) a_{n+2} - n(n - 1) a_n = -2a_n$$

or, what amounts to the same thing,

$$a_{n+2} = \frac{n^2 - n - 2}{(n + 2)(n + 1)} a_n = \frac{n - 2}{n + 2} a_n .$$


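This relation enables us to determine $a_2, a_4, a_6, \ldots$ successively in terms of $a_0$. Similarly,
we can compute $a_3, a_5, a_7, \ldots$ in terms of $a_1$. For the coefficients with even subscripts,
we find that

$$a_2 = -a_0 , \qquad a_4 = 0 \cdot a_2 = 0 , \qquad a_6 = a_8 = a_{10} = \cdots = 0 .$$

The odd coefficients are

$$a_3 = \frac{1 - 2}{1 + 2} a_1 = \frac{-1}{3} a_1 , \qquad a_5 = \frac{3 - 2}{3 + 2} a_3 = \frac{1}{5} \cdot \frac{-1}{3} a_1 , \qquad a_7 = \frac{5 - 2}{5 + 2} a_5 = \frac{3}{7} \cdot \frac{1}{5} \cdot \frac{-1}{3} a_1 = \frac{-1}{7 \cdot 5} a_1 ,$$

and, in general,

$$a_{2n+1} = \frac{2n - 3}{2n + 1} a_{2n-1} = \frac{2n - 3}{2n + 1} \cdot \frac{2n - 5}{2n - 1} \cdot \frac{2n - 7}{2n - 3} \cdots \frac{3}{7} \cdot \frac{1}{5} \cdot \frac{-1}{3} a_1 .$$

When the common factors are canceled, this simplifies to

$$a_{2n+1} = \frac{-1}{(2n + 1)(2n - 1)} a_1 .$$

Therefore, the series for y can be written as follows:

$$y = a_0 (1 - x^2) - a_1 \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n + 1)(2n - 1)} .$$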
The ratio test may be used to verify the convergence of this series for |x| < 1. The work
just carried out shows that the series actually satisfies the differential equation in (11.13),
where $a_0$ and $a_1$ may be thought of as arbitrary constants. The reader should note that
in this particular example the polynomial which multiplies $a_0$ is itself a solution of (11.13),
and the series which multiplies $a_1$ is another solution.
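
The recursion for the coefficients is easy to check by machine. The following Python sketch generates the coefficients from $a_{n+2} = \frac{n-2}{n+2} a_n$ with exact rational arithmetic and compares the odd ones with the closed form $a_{2n+1} = -a_1/((2n+1)(2n-1))$. The function names `series_coefficients` and `closed_form_odd` are illustrative choices, not notation from the text, and the starting values $a_0 = a_1 = 1$ are picked only for the demonstration.

```python
from fractions import Fraction


def series_coefficients(a0, a1, count):
    """Return [a_0, ..., a_{count-1}] generated by a_{n+2} = (n - 2)/(n + 2) * a_n."""
    coeffs = [Fraction(a0), Fraction(a1)]
    for n in range(count - 2):
        coeffs.append(Fraction(n - 2, n + 2) * coeffs[n])
    return coeffs[:count]


def closed_form_odd(a1, n):
    """Closed form a_{2n+1} = -a_1 / ((2n + 1)(2n - 1)) for n >= 0."""
    return -Fraction(a1) / ((2 * n + 1) * (2 * n - 1))


if __name__ == "__main__":
    coeffs = series_coefficients(1, 1, 12)  # a_0 = a_1 = 1, chosen only for illustration
    for n in range(6):
        # Each odd coefficient from the recursion should equal the closed form.
        print(2 * n + 1, coeffs[2 * n + 1], closed_form_odd(1, n))
```

Exact fractions are used instead of floating-point numbers so that the comparison with the closed form is an equality test rather than a tolerance check.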

The procedure just described is called the method of undetermined coefficients. Another 
way to find these coefficients is to use the formula 


$$a_n = \frac{f^{(n)}(0)}{n!} \qquad \text{if } y = f(x) .$$

Sometimes the higher derivatives of y at the origin can be computed directly from the
differential equation. For example, setting x = 0 in (11.13) we immediately obtain

$$f''(0) = -2f(0) = -2a_0 ,$$

and hence we have

$$a_2 = \frac{f''(0)}{2!} = -a_0 .$$

To find the higher derivatives, we differentiate the differential equation to obtain

(11.16) $$(1 - x^2) y''' - 2xy'' = -2y' .$$

Putting x = 0, we see that $f'''(0) = -2f'(0) = -2a_1$, and hence $a_3 = f'''(0)/3! = -a_1/3$.
Differentiation of (11.16) leads to the equation

$$(1 - x^2) y^{(4)} - 4xy''' = 0 .$$

When x = 0, this yields $f^{(4)}(0) = 0$, and hence $a_4 = 0$. Repeating the process once more,
we find

$$(1 - x^2) y^{(5)} - 6xy^{(4)} - 4y''' = 0 ,$$

$$f^{(5)}(0) = 4f'''(0) = -8a_1 , \qquad a_5 = \frac{f^{(5)}(0)}{5!} = -\frac{a_1}{15} .$$

It is clear that the process may be continued as long as desired.
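
The repeated differentiation can also be automated. The sketch below rests on an assumption that is not stated in the text: differentiating (11.13) n times by Leibniz's rule gives $(1 - x^2) y^{(n+2)} - 2nx y^{(n+1)} - n(n-1) y^{(n)} = -2y^{(n)}$, so that at x = 0 we get $f^{(n+2)}(0) = (n(n-1) - 2) f^{(n)}(0)$. This agrees with the cases n = 0, 1, 2, 3 worked out above. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def derivatives_at_origin(f0, f1, order):
    """Return [f(0), f'(0), ..., f^(order)(0)] for a solution of (1 - x^2) y'' = -2y.

    Uses f^(n+2)(0) = (n(n - 1) - 2) f^(n)(0), obtained by differentiating the
    equation n times (an assumption of this sketch) and setting x = 0.
    """
    d = [Fraction(f0), Fraction(f1)]
    for n in range(order - 1):
        d.append((n * (n - 1) - 2) * d[n])
    return d[:order + 1]


def taylor_coefficients(f0, f1, order):
    """Return a_k = f^(k)(0) / k! for k = 0, ..., order."""
    derivs = derivatives_at_origin(f0, f1, order)
    return [dk / factorial(k) for k, dk in enumerate(derivs)]


if __name__ == "__main__":
    # With f(0) = a_0 = 1 and f'(0) = a_1 = 1 (illustrative values only).
    print(derivatives_at_origin(1, 1, 5))
    print(taylor_coefficients(1, 1, 5))
```

Both routes, equating coefficients and differentiating the equation, must produce the same coefficients, which gives a simple consistency check on hand calculations.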


11.15 The binomial series 

We can also use our knowledge of differential equations to determine the sums of certain 
power series. For example, we shall use the existence-uniqueness theorem for first-order 
linear differential equations to prove that the binomial series expansion 



(11.17) $$(1 + x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$$

is valid in the interval |x| < 1. Here the exponent α is an arbitrary real number and $\binom{\alpha}{n}$
denotes the binomial coefficient defined by

(11.18) $$\binom{\alpha}{n} = \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!} .$$

When α is a nonnegative integer, all but a finite number of the coefficients $\binom{\alpha}{n}$ are zero, and
the series reduces to a polynomial of degree α, giving us the familiar binomial theorem.
To prove (11.17) for an arbitrary real α, we first use the ratio test to find that the series
converges absolutely in the open interval -1 < x < 1. Then we define a function f by
means of the equation

(11.19) $$f(x) = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n \qquad \text{if } |x| < 1 .$$

We then show that f is a solution of the linear differential equation

(11.20) $$y' - \frac{\alpha}{x + 1} y = 0$$

and satisfies the initial condition f(0) = 1. Theorem 8.3 tells us that in any interval not
containing the point x = -1 there is only one solution of this differential equation with
y = 1 when x = 0. Since $y = (1 + x)^{\alpha}$ is such a solution, it follows that $f(x) = (1 + x)^{\alpha}$
if -1 < x < 1.

Therefore, to prove (11.17) we need only show that f satisfies the differential equation
(11.20). For this purpose, we require the following property of the binomial coefficients:

$$(n + 1) \binom{\alpha}{n + 1} = (\alpha - n) \binom{\alpha}{n} .$$

This property, which is an immediate consequence of the definition in (11.18), holds for
every real α and every integer n ≥ 0. It can also be expressed in the form

(11.21) $$(n + 1) \binom{\alpha}{n + 1} + n \binom{\alpha}{n} = \alpha \binom{\alpha}{n} .$$

Differentiation of (11.19) gives us

$$f'(x) = \sum_{n=1}^{\infty} n \binom{\alpha}{n} x^{n-1} = \sum_{n=0}^{\infty} (n + 1) \binom{\alpha}{n + 1} x^n ,$$

from which we find that

$$(1 + x) f'(x) = \sum_{n=0}^{\infty} \left\{ (n + 1) \binom{\alpha}{n + 1} + n \binom{\alpha}{n} \right\} x^n = \alpha \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n = \alpha f(x) ,$$

because of (11.21). This shows that f satisfies the differential equation (11.20) and this, in
turn, proves (11.17).
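
As a numerical illustration of the binomial series, the following Python sketch computes generalized binomial coefficients from the product in their definition and compares a partial sum of the series with $(1 + x)^{\alpha}$. The function names, the sample values of α and x, and the number of terms are all illustrative assumptions; floating-point arithmetic is used, so agreement is only up to rounding error.

```python
def binomial_coefficient(alpha, n):
    """Generalized binomial coefficient alpha(alpha - 1)...(alpha - n + 1) / n!."""
    result = 1.0
    for k in range(n):
        # Multiply in one factor (alpha - k) of the numerator and 1/(k + 1) of n!.
        result *= (alpha - k) / (k + 1)
    return result


def binomial_partial_sum(alpha, x, terms):
    """Sum of the first `terms` terms of the binomial series for (1 + x)^alpha."""
    return sum(binomial_coefficient(alpha, n) * x ** n for n in range(terms))


if __name__ == "__main__":
    alpha, x = 0.5, 0.2  # any real alpha and any x with |x| < 1 would do
    print(binomial_partial_sum(alpha, x, 40), (1 + x) ** alpha)
    # When alpha is a nonnegative integer the coefficients vanish for n > alpha.
    print([binomial_coefficient(3, n) for n in range(6)])
```

The closer |x| is to 1, the more terms are needed for a given accuracy, in line with the ratio test used in the proof.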


11.16 Exercises 


1. The differential equation $(1 - x^2) y'' - 2xy' + 6y = 0$ has a power-series solution
$f(x) = \sum_{n=0}^{\infty} a_n x^n$ with f(0) = 1 and f'(0) = 0. Use the method of undetermined coefficients to obtain
a recursion formula relating $a_{n+2}$ to $a_n$. Determine $a_n$ explicitly for each n and find the sum
of the series.

2. Do the same as in Exercise 1 for the differential equation $(1 - x^2) y'' - 2xy' + 12y = 0$ and
the initial conditions f(0) = 0, f'(0) = 2.


In each of Exercises 3 through 9, the power series is used to define the function f. Determine
the interval of convergence in each case and show that f satisfies the differential equation indicated,
where y = f(x). In Exercises 6 through 9, solve the differential equation and thereby obtain the
sum of the series.


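3. $f(x) = \sum_{n=0}^{\infty} \frac{x^{4n}}{(4n)!}$; $\frac{d^4 y}{dx^4} = y$.

4. $f(x) = \sum_{n=0}^{\infty} \frac{x^n}{(n!)^2}$; $xy'' + y' - y = 0$.

5. $f(x) = 1 + \sum_{n=1}^{\infty} \frac{1 \cdot 4 \cdot 7 \cdots (3n - 2)}{(3n)!} x^{3n}$; $y'' = x^a y + b$. (Find a and b.)

6. $f(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{n!}$; $y' = 2xy$.

7. $f(x) = \sum_{n=2}^{\infty} \frac{x^n}{n!}$; $y' = x + y$.

8. $f(x) = \sum_{n=0}^{\infty} \frac{(-1)^n 2^{2n} x^{2n}}{(2n)!}$; $y'' + 4y = 0$.

9. $f(x) = x + \sum_{n=0}^{\infty} \frac{(3x)^{2n+1}}{(2n + 1)!}$; $y'' = 9(y - x)$.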

10. The functions $J_0$ and $J_1$ defined by the series

$$J_0(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(n!)^2 \, 2^{2n}} , \qquad J_1(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{n! \, (n + 1)! \, 2^{2n+1}}$$

are called Bessel functions of the first kind of orders zero and one, respectively. These functions
arise in many problems in pure and applied mathematics. Show that (a) both series converge
for all real x; (b) $J_0'(x) = -J_1(x)$; (c) $j_0(x) = j_1'(x)$, where $j_0(x) = xJ_0(x)$ and $j_1(x) = xJ_1(x)$.

11. The differential equation

$$x^2 y'' + xy' + (x^2 - n^2) y = 0$$

is called Bessel's equation. Show that $J_0$ and $J_1$ (as defined in Exercise 10) are solutions when
n = 0 and 1, respectively.

In each of Exercises 12, 13, and 14, assume the given differential equation has a power-series 
solution and find the first four nonzero terms. 

12. $y' = x^2 + y^2$, with y = 1 when x = 0.

13. $y' = 1 + xy^2$, with y = 0 when x = 0.

14. $y' = x + y^2$, with y = 0 when x = 0.
In Exercises 15, 16, and 17, assume the given differential equation has a power-series solution of
the form $y = \sum a_n x^n$, and determine the nth coefficient $a_n$.

15. $y' = \alpha y$. 16. $y'' = xy$. 17. $y'' + xy' + y = 0$.

18. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$, where $a_0 = 1$ and the remaining coefficients are determined by the
identity

$$e^{-2x} = \sum_{n=0}^{\infty} \{2a_n + (n + 1) a_{n+1}\} x^n .$$

Compute $a_1$, $a_2$, $a_3$, and find the sum of the series for f(x).

19. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$, where the coefficients $a_n$ are determined by the relation

$$\cos x = \sum_{n=0}^{\infty} a_n (n + 2) x^n .$$

Compute $a_5$, $a_6$, and f(π).

20. (a) Show that the first six terms of the binomial series for $(1 - x)^{-1/2}$ are:

$$1 + \frac{1}{2} x + \frac{3}{8} x^2 + \frac{5}{16} x^3 + \frac{35}{128} x^4 + \frac{63}{256} x^5 .$$

(b) Let $a_n$ denote the nth term of this series when x = 1/50, and let $r_n$ denote the remainder
after n terms; that is, for n ≥ 0 let

$$r_n = a_{n+1} + a_{n+2} + a_{n+3} + \cdots .$$

Show that $0 < r_n < a_n/49$.

[Hint: Show that $a_{n+1} < a_n/50$, and dominate $r_n$ by a suitable geometric series.]

(c) Verify the identity

$$\sqrt{2} = \frac{7}{5} \left(1 - \frac{1}{50}\right)^{-1/2}$$

and use it to compute the first ten correct decimals of $\sqrt{2}$.

[Hint: Use parts (a) and (b), retain twelve decimals during the calculations,
and take into account round-off errors.]

21. (a) Show that

$$\sqrt{3} = \frac{1732}{1000} \left(1 - \frac{176}{3{,}000{,}000}\right)^{-1/2} .$$

(b) Proceed as suggested in Exercise 20 and compute the first fifteen correct decimals of $\sqrt{3}$.

22. Integrate the binomial series for $(1 - x^2)^{-1/2}$ and thereby obtain the power-series expansion

$$\arcsin x = x + \sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \, \frac{x^{2n+1}}{2n + 1} \qquad (|x| < 1) .$$
VECTOR ALGEBRA


12.1 Historical introduction 

In the foregoing chapters we have presented many of the basic concepts of calculus 
and have illustrated their use in solving a few relatively simple geometrical and physical 
problems. Further applications of the calculus require a deeper knowledge of analytic 
geometry than has been presented so far, and therefore we turn our attention to a more 
detailed investigation of some fundamental geometric ideas. 

As we have pointed out earlier in this book, calculus and analytic geometry were 
intimately related throughout their historical development. Every new discovery in one 
subject led to an improvement in the other. The problem of drawing tangents to curves 
resulted in the discovery of the derivative; that of area led to the integral; and partial 
derivatives were introduced to investigate curved surfaces in space. Along with these 
accomplishments came other parallel developments in mechanics and mathematical 
physics. In 1788 Lagrange published his masterpiece Mécanique analytique (Analytical
Mechanics) which showed the great flexibility and tremendous power attained by using
analytical methods in the study of mechanics. Later on, in the 19th Century, the Irish
mathematician William Rowan Hamilton (1805-1865) introduced his Theory of Quaternions,
a new method and a new point of view that contributed much to the understanding of both
algebra and physics. The best features of quaternion analysis and Cartesian geometry were
later united, largely through the efforts of J. W. Gibbs (1839-1903) and O. Heaviside
(1850-1925), and a new subject called vector algebra sprang into being. It was soon realized
that vectors are the ideal tools for the exposition and simplification of many important 
ideas in geometry and physics. In this chapter we propose to discuss the elements of vector 
algebra. Applications to analytic geometry are given in Chapter 13. In Chapter 14 vector 
algebra is combined with the methods of calculus, and applications are given to both 
geometry and mechanics. 

There are essentially three different ways to introduce vector algebra: geometrically, 
analytically, and axiomatically. In the geometric approach, vectors are represented by 
directed line segments, or arrows. Algebraic operations on vectors, such as addition, 
subtraction, and multiplication by real numbers, are defined and studied by geometric 
methods. 

In the analytic approach, vectors and vector operations are described entirely in terms 
of numbers, called components. Properties of the vector operations are then deduced from 
corresponding properties of numbers. The analytic description of vectors arises naturally
from the geometric description as soon as a coordinate system is introduced. 

In the axiomatic approach, no attempt is made to describe the nature of a vector or of 
the algebraic operations on vectors. Instead, vectors and vector operations are thought 
of as undefined concepts of which we know nothing except that they satisfy a certain set of 
axioms. Such an algebraic system, with appropriate axioms, is called a linear space or a 
linear vector space. Examples of linear spaces occur in all branches of mathematics, and
we will study many of them in Chapter 15. The algebra of directed line segments and the 
algebra of vectors described by components are merely two examples of linear spaces. 

The study of vector algebra from the axiomatic point of view is perhaps the most 
mathematically satisfactory approach to use since it furnishes a description of vectors that 
is free of coordinate systems and free of any particular geometric representation. This 
study is carried out in detail in Chapter 15. In this chapter we base our treatment on the 
analytic approach, and we also use directed line segments to interpret many of the results 
geometrically. When possible, we give proofs by coordinate-free methods. Thus, this 
chapter serves to provide familiarity with important concrete examples of vector spaces, 
and it also motivates the more abstract approach in Chapter 15. 


12.2 The vector space of n-tuples of real numbers 

The idea of using a number to locate a point on a line was known to the ancient Greeks.
In 1637 Descartes extended this idea, using a pair of numbers $(a_1, a_2)$ to locate a point in
the plane, and a triple of numbers $(a_1, a_2, a_3)$ to locate a point in space. The 19th Century
mathematicians A. Cayley (1821-1895) and H. G. Grassmann (1809-1877) realized that
there is no need to stop with three numbers. One can just as well consider a quadruple of
numbers $(a_1, a_2, a_3, a_4)$ or, more generally, an n-tuple of real numbers

$$(a_1, a_2, \ldots, a_n)$$

for any integer n ≥ 1. Such an n-tuple is called an n-dimensional point or an n-dimensional
vector, the individual numbers $a_1, a_2, \ldots, a_n$ being referred to as coordinates or components
of the vector. The collection of all n-dimensional vectors is called the vector space of
n-tuples, or simply n-space. We denote this space by $V_n$.

The reader may well ask at this stage why we are interested in spaces of dimension 
greater than three. One answer is that many problems which involve a large number of 
simultaneous equations are more easily analyzed by introducing vectors in a suitable 
n-space and replacing all these equations by a single vector equation. Another advantage
is that we are able to deal in one stroke with many properties common to 1-space, 2-space, 
3-space, etc., that is, properties independent of the dimensionality of the space. This 
is in keeping with the spirit of modern mathematics which favors the development of 
comprehensive methods for attacking problems on a wide front. 

Unfortunately, the geometric pictures which are a great help in motivating and illustrating
vector concepts when n = 1, 2, and 3 are not available when n > 3; therefore, the study
of vector algebra in higher-dimensional spaces must proceed entirely by analytic means.

In this chapter we shall usually denote vectors by capital letters A, B, C, ..., and
components by the corresponding small letters a, b, c, .... Thus, we write

$$A = (a_1, a_2, \ldots, a_n) .$$

To convert $V_n$ into an algebraic system, we introduce equality of vectors and two vector
operations called addition and multiplication by scalars. The word “scalar” is used here as
a synonym for “real number.”

DEFINITION. Two vectors A and B in $V_n$ are called equal whenever they agree in their
respective components. That is, if $A = (a_1, a_2, \ldots, a_n)$ and $B = (b_1, b_2, \ldots, b_n)$, the vector
equation A = B means exactly the same as the n scalar equations

$$a_1 = b_1 , \quad a_2 = b_2 , \quad \ldots , \quad a_n = b_n .$$

The sum A + B is defined to be the vector obtained by adding corresponding components:

$$A + B = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n) .$$

If c is a scalar, we define cA or Ac to be the vector obtained by multiplying each component
of A by c:

$$cA = (ca_1, ca_2, \ldots, ca_n) .$$

From this definition it is easy to verify the following properties of these operations. 

THEOREM 12.1. Vector addition is commutative,

A + B = B + A , 

and associative, 

A + (B + C) = (A + B) + C. 

Multiplication by scalars is associative, 


c(dA) = (cd)A 


and satisfies the two distributive laws 

c(A + B) = cA + cB , and (c + d)A = cA + dA .

Proofs of these properties follow quickly from the definition and are left as exercises for 
the reader. 

The vector with all components 0 is called the zero vector and is denoted by O. It has
the property that A + O = A for every vector A; in other words, O is an identity element
for vector addition. The vector (-1)A is also denoted by -A and is called the negative
of A. We also write A - B for A + (-B) and call this the difference of A and B. The
equation (A + B) - B = A shows that subtraction is the inverse of addition. Note that
0A = O and that 1A = A.
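
The componentwise definitions translate directly into code. The following Python sketch represents a vector in n-space as a tuple of numbers; the function names `add`, `scale`, and `subtract` are illustrative choices and not notation from the text.

```python
def add(A, B):
    """Vector sum A + B, formed by adding corresponding components."""
    if len(A) != len(B):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(A, B))


def scale(c, A):
    """Scalar multiple cA, formed by multiplying each component of A by c."""
    return tuple(c * a for a in A)


def subtract(A, B):
    """Difference A - B, defined as A + (-1)B."""
    return add(A, scale(-1, B))


if __name__ == "__main__":
    A, B = (1, 3, 6), (4, -3, 3)
    print(add(A, B))               # sum of A and B
    print(subtract(add(A, B), B))  # subtraction undoes addition, giving A back
    print(scale(0, A))             # the zero vector
```

Because each operation acts on one component at a time, the commutative, associative, and distributive laws of Theorem 12.1 reduce to the corresponding laws for real numbers.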

The reader may have noticed the similarity between vectors in 2-space and complex 
numbers. Both are defined as ordered pairs of real numbers and both are added in exactly 
the same way. Thus, as far as addition is concerned, complex numbers and two-dimensional
vectors are algebraically indistinguishable. They differ only when we introduce multiplication.

Multiplication of complex numbers gives the complex-number system the field properties
also possessed by the real numbers. It can be shown (although the proof is difficult) that
except for n = 1 and 2, it is not possible to introduce multiplication in $V_n$ so as to satisfy
all the field properties. However, special products can be introduced in $V_n$ which do not
satisfy all the field properties. For example, in Section 12.5 we shall discuss the dot product
of two vectors in $V_n$. The result of this multiplication is a scalar, not a vector. Another
product, called the cross product, is discussed in Section 13.9. This multiplication is
applicable only in the space $V_3$. The result is always a vector, but the cross product is
not commutative.

12.3 Geometric interpretation for n ≤ 3

Although the foregoing definitions are completely divorced from geometry, vectors and 
vector operations have an interesting geometric interpretation for spaces of dimension 
three or less. We shall draw pictures in 2-space to illustrate these concepts and ask the 
reader to produce the corresponding visualizations for himself in 3-space and in 1-space.



Figure 12.1 The geometric vector AB from A to B.

Figure 12.2 AB and CD are equivalent because B - A = D - C.

A pair of points A and B is called a geometric vector if one of the points, say A, is called
the initial point and the other, B, the terminal point, or tip. We visualize a geometric vector
as an arrow from A to B, as shown in Figure 12.1, and denote it by the symbol $\overrightarrow{AB}$.

Geometric vectors are especially convenient for representing certain physical quantities 
such as force, displacement, velocity, and acceleration, which possess both magnitude and 
direction. The length of the arrow is a measure of the magnitude and the arrowhead 
indicates the required direction. 
Suppose we introduce a coordinate system with origin O. Figure 12.2 shows two geometric
vectors AB and CD with B - A = D - C. In terms of components, this means
that we have

$$b_1 - a_1 = d_1 - c_1 \quad \text{and} \quad b_2 - a_2 = d_2 - c_2 .$$

By comparison of the congruent triangles in Figure 12.2, we see that the two arrows 
representing AB and CD have equal lengths, are parallel, and point in the same direction. 
We call such geometric vectors equivalent. That is, we say AB is equivalent to CD whenever 

(12.1) B - A = D - C . 

Note that the four points A, B, C, D are vertices of a parallelogram. (See Figure 12.3.)
Equation (12.1) can also be written in the form A + D = B + C which tells us that
opposite vertices of the parallelogram have the same sum. In particular, if one of the vertices,
say A, is the origin O, as in Figure 12.4, the geometric vector from O to the opposite vertex
D corresponds to the vector sum D = B + C. This is described by saying that vector
addition corresponds geometrically to addition of geometric vectors by the parallelogram
law. The importance of vectors in physics stems from the remarkable fact that many
physical quantities (such as force, velocity, and acceleration) combine by the parallelogram
law.
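
The equivalence condition (12.1) and its rearranged form can be checked on concrete points. In the Python sketch below, points are tuples of coordinates, and the helper names `add`, `subtract`, and `are_equivalent` are hypothetical; the sample points are chosen only for illustration.

```python
def add(A, B):
    """Componentwise sum of two points or vectors."""
    return tuple(a + b for a, b in zip(A, B))


def subtract(A, B):
    """Componentwise difference A - B."""
    return tuple(a - b for a, b in zip(A, B))


def are_equivalent(A, B, C, D):
    """True when the geometric vectors AB and CD satisfy B - A = D - C."""
    return subtract(B, A) == subtract(D, C)


if __name__ == "__main__":
    # A is the origin, so the vertex opposite to A is D = B + C.
    A, B, C = (0, 0), (2, 1), (1, 3)
    D = add(B, C)
    print(are_equivalent(A, B, C, D))  # AB and CD are equivalent
    print(add(A, D) == add(B, C))      # opposite vertices have the same sum
```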



Figure 12.3 Opposite vertices of a parallelogram have the same sum: A + D = B + C.

Figure 12.4 Vector addition interpreted geometrically by the parallelogram law.



For simplicity in notation, we shall use the same symbol to denote a point in $V_n$ (when
n ≤ 3) and the geometric vector from the origin to this point. Thus, we write A instead of
$\overrightarrow{OA}$, B instead of $\overrightarrow{OB}$, and so on. Sometimes we also write A in place of any geometric
vector equivalent to $\overrightarrow{OA}$. For example, Figure 12.5 illustrates the geometric meaning of
vector subtraction. Two geometric vectors are labeled as B - A, but these geometric vectors
are equivalent. They have the same length and the same direction.

Figure 12.6 illustrates the geometric meaning of multiplication by scalars. If B = cA, 
the geometric vector B has length |c| times the length of A; it points in the same direction 
as A if c is positive, and in the opposite direction if c is negative. 
Figure 12.5 Geometric meaning of subtraction of vectors.

Figure 12.6 Multiplication of vectors by scalars.

The geometric interpretation of vectors in $V_n$ for n ≤ 3 suggests a way to define
parallelism in a general n-space.

DEFINITION. Two vectors A and B in $V_n$ are said to have the same direction if B = cA
for some positive scalar c, and the opposite direction if B = cA for some negative c. They are
called parallel if B = cA for some nonzero c.

Note that this definition makes every vector have the same direction as itself, a property
which we surely want. Note also that this definition ascribes the following properties to
the zero vector: The zero vector is the only vector having the same direction as its negative
and therefore the only vector having the opposite direction to itself. The zero vector is the
only vector parallel to the zero vector.
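
The definition can be tested mechanically by looking for a scalar c with B = cA. The Python sketch below does this with exact rational arithmetic; the function name `classify` and the returned labels are illustrative assumptions, not terminology from the text. It also reproduces the special behavior of the zero vector noted above.

```python
from fractions import Fraction


def classify(A, B):
    """Return the set of relations ('same', 'opposite', 'parallel') that B has to A."""
    if len(A) != len(B):
        raise ValueError("vectors must have the same number of components")
    a_zero = all(a == 0 for a in A)
    b_zero = all(b == 0 for b in B)
    if a_zero:
        # cA is the zero vector for every c, so B = cA is possible only when B is zero,
        # and then every positive, negative, or nonzero c works.
        return {"same", "opposite", "parallel"} if b_zero else set()
    # A has a nonzero component; it determines the only possible value of c.
    pivot = next(i for i, a in enumerate(A) if a != 0)
    c = Fraction(B[pivot]) / Fraction(A[pivot])
    if any(Fraction(b) != c * a for a, b in zip(A, B)):
        return set()  # no scalar c satisfies B = cA
    if c > 0:
        return {"same", "parallel"}
    if c < 0:
        return {"opposite", "parallel"}
    return set()  # c = 0: B is the zero vector but A is not


if __name__ == "__main__":
    print(classify((1, 2), (2, 4)))    # same direction
    print(classify((1, 2), (-3, -6)))  # opposite direction
    print(classify((1, 2), (2, 5)))    # not parallel
    print(classify((0, 0), (0, 0)))    # the zero vector compared with itself
```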

12.4 Exercises 

1. Let A = (1, 3, 6), B = (4, -3, 3), and C = (2, 1, 5) be three vectors in $V_3$. Determine the
components of each of the following vectors: (a) A + B; (b) A - B; (c) A + B - C; (d)
7A - 2B - 3C; (e) 2A + B - 3C.

2. Draw the geometric vectors from the origin to the points A = (2, 1) and B = (1, 3). On the 
same figure, draw the geometric vector from the origin to the point C =A + tB for each of the 
following values of t:t= J; t = J; t = £; / = 1 ; t = 2; t = — 1; t = —2. 

3. Solve Exercise 2 if C = tA + B. 

4. Let A = (2, 1), B = (1, 3), and C = xA + yB, where x and y are scalars.

(a) Draw the geometric vector from the origin to C for each of the following pairs of values of

x&ndy-.x =y = i; x = J, y = f ; x = %,y = §; x = 2, y = -l;x = 3, y = -2;x= 

y = f ; x = -\,y =2. 

(b) What do you think is the set of points C obtained as x and y run through all real numbers 
such that x + y = 1? (Just make a guess and show the locus on the figure. No proof is 
required.) 

(c) Make a guess for the set of all points C obtained as x and y range independently over the
intervals 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and make a sketch of this set.

(d) What do you think is the set of all C obtained if x ranges through the interval 0 ≤ x ≤ 1
and y ranges through all real numbers?

(e) What do you think is the set if x and y both range over all real numbers? 

5. Let A = (2, 1) and B = (1, 3). Show that every vector $C = (c_1, c_2)$ in $V_2$ can be expressed in
the form C = xA + yB. Express x and y in terms of $c_1$ and $c_2$.
6. Let A = (1, 1, 1), B = (0, 1, 1), and C = (1, 1, 0) be three vectors in $V_3$ and let D = xA +
yB + zC, where x, y, z are scalars.

(a) Determine the components of D.

(b) If D = O, prove that x = y = z = 0.

(c) Find x, y, z such that D = (1, 2, 3).

7. Let A = (1, 1, 1), B = (0, 1, 1) and C = (2, 1, 1) be three vectors in $V_3$, and let D = xA +
yB + zC, where x, y, and z are scalars.

(a) Determine the components of D.

(b) Find x, y, and z, not all zero, such that D = O.

(c) Prove that no choice of x, y, z makes D = (1, 2, 3).

8. Let A = (1, 1, 1, 0), B = (0, 1, 1, 1), C = (1, 1, 0, 0) be three vectors in $V_4$, and let D =
xA + yB + zC, where x, y, and z are scalars.

(a) Determine the components of D.

(b) If D = O, prove that x = y = z = 0.

(c) Find x, y, and z such that D = (1, 5, 3, 4).

(d) Prove that no choice of x, y, z makes D = (1, 2, 3, 4).

9. In V n , prove that two vectors parallel to the same vector are parallel to each other. 

10. Given four nonzero vectors A, B, C, D in $V_n$ such that C = A + B and A is parallel to D.
Prove that C is parallel to D if and only if B is parallel to D.

11. (a) Prove, for vectors in $V_n$, the properties of addition and multiplication by scalars given in
Theorem 12.1.

(b) By drawing geometric vectors in the plane, illustrate the geometric meaning of the two
distributive laws (c + d)A = cA + dA and c(A + B) = cA + cB.

12. If a quadrilateral OABC in $V_2$ is a parallelogram having A and C as opposite vertices, prove
that $A + \frac{1}{2}(C - A) = \frac{1}{2}B$. What geometrical theorem about parallelograms can you deduce
from this equation?


12.5 The dot product 

We introduce now a new kind of multiplication called the dot product or scalar product 
of two vectors in V n . 

DEFINITION. If $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$ are two vectors in $V_n$, their dot
product is denoted by A · B and is defined by the equation

$$A \cdot B = \sum_{k=1}^{n} a_k b_k .$$

Thus, to compute A · B we multiply corresponding components of A and B and then
add all the products. This multiplication has the following algebraic properties.


THEOREM 12.2. For all vectors A, B, C in $V_n$ and all scalars c, we have the following
properties:

(a) A · B = B · A (commutative law),

(b) A · (B + C) = A · B + A · C (distributive law),

(c) c(A · B) = (cA) · B = A · (cB) (homogeneity),

(d) A · A > 0 if A ≠ O (positivity),

(e) A · A = 0 if A = O.

Proof. The first three properties are easy consequences of the definition and are left
as exercises. To prove the last two, we use the relation $A \cdot A = \sum a_k^2$. Since each term is
nonnegative, the sum is nonnegative. Moreover, the sum is zero if and only if each term
in the sum is zero and this can happen only if A = O.
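
The dot product and the properties in Theorem 12.2 can be spot-checked numerically. The Python sketch below is a minimal illustration with vectors stored as tuples; the function name `dot` and the sample vectors are illustrative assumptions, and checking a few integer examples is of course not a proof.

```python
def dot(A, B):
    """Dot product: multiply corresponding components and add the products."""
    if len(A) != len(B):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(A, B))


if __name__ == "__main__":
    A, B, C = (1, 3, 6), (4, -3, 3), (2, 1, 5)
    c = 7
    B_plus_C = tuple(b + k for b, k in zip(B, C))
    cA = tuple(c * a for a in A)
    print(dot(A, B) == dot(B, A))                      # commutative law
    print(dot(A, B_plus_C) == dot(A, B) + dot(A, C))   # distributive law
    print(c * dot(A, B) == dot(cA, B))                 # homogeneity
    print(dot(A, A) > 0, dot((0, 0, 0), (0, 0, 0)))    # positivity and the zero vector
```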

The dot product has an interesting geometric interpretation which will be described in 
Section 12.9. Before we discuss this, however, we mention an important inequality con- 
cerning dot products that is fundamental in vector algebra. 

THEOREM 12.3. THE CAUCHY-SCHWARZ INEQUALITY. If A and B are vectors in $V_n$, we
have

(12.2) $$(A \cdot B)^2 \le (A \cdot A)(B \cdot B) .$$

Moreover, the equality sign holds if and only if one of the vectors is a scalar multiple of the
other.

Proof. Expressing each member of (12.2) in terms of components, we obtain

$$\left( \sum_{k=1}^{n} a_k b_k \right)^2 \le \left( \sum_{k=1}^{n} a_k^2 \right) \left( \sum_{k=1}^{n} b_k^2 \right) ,$$

which is the inequality proved earlier in Theorem 1.41.

We shall present another proof of (12.2) that makes no use of components. Such a proof 
is of interest because it shows that the Cauchy-Schwarz inequality is a consequence of the 
five properties of the dot product listed in Theorem 12.2 and does not depend on the 
particular definition that was used to deduce these properties. 

To carry out this proof, we notice first that (12.2) holds trivially if either A or B is the
zero vector. Therefore, we may assume that both A and B are nonzero. Let C be the vector

$$C = xA - yB , \quad \text{where } x = B \cdot B \text{ and } y = A \cdot B .$$

Properties (d) and (e) imply that C · C ≥ 0. When we translate this in terms of x and y,
it will yield (12.2). To express C · C in terms of x and y, we use properties (a), (b) and (c)
to obtain

$$C \cdot C = (xA - yB) \cdot (xA - yB) = x^2 (A \cdot A) - 2xy (A \cdot B) + y^2 (B \cdot B) .$$

Using the definitions of x and y and the inequality C · C ≥ 0, we get

$$(B \cdot B)^2 (A \cdot A) - 2 (A \cdot B)^2 (B \cdot B) + (A \cdot B)^2 (B \cdot B) \ge 0 .$$

Property (d) implies B · B > 0 since B ≠ O, so we may divide by (B · B) to obtain

$$(B \cdot B)(A \cdot A) - (A \cdot B)^2 \ge 0 ,$$

which is (12.2). This proof also shows that the equality sign holds in (12.2) if and only
if C = O. But C = O if and only if xA = yB. This equation holds, in turn, if and only if
one of the vectors is a scalar multiple of the other.
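
The key step of the coordinate-free proof, the identity C · C = (B · B)[(A · A)(B · B) - (A · B)²] for C = xA - yB, can be verified on sample vectors. The Python sketch below uses integer components so that all comparisons are exact; the function names and the sample vectors are illustrative assumptions.

```python
def dot(A, B):
    """Dot product of two vectors stored as tuples."""
    return sum(a * b for a, b in zip(A, B))


def cauchy_schwarz_gap(A, B):
    """(A.A)(B.B) - (A.B)^2, which Theorem 12.3 asserts is nonnegative."""
    return dot(A, A) * dot(B, B) - dot(A, B) ** 2


def auxiliary_vector(A, B):
    """C = xA - yB with x = B.B and y = A.B, as in the proof."""
    x, y = dot(B, B), dot(A, B)
    return tuple(x * a - y * b for a, b in zip(A, B))


if __name__ == "__main__":
    A, B = (1, 2, 3), (4, -5, 6)
    C = auxiliary_vector(A, B)
    # C.C equals (B.B) times the Cauchy-Schwarz gap, so the gap cannot be negative.
    print(dot(C, C) == dot(B, B) * cauchy_schwarz_gap(A, B))
    # When B is a scalar multiple of A, C is the zero vector and equality holds.
    print(auxiliary_vector(A, (3, 6, 9)), cauchy_schwarz_gap(A, (3, 6, 9)))
```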






The Cauchy-Schwarz inequality has important applications to the properties of the 
length or norm of a vector, a concept which we discuss next. 


12.6 Length or norm of a vector 

Figure 12.7 shows the geometric vector from the origin to a point $A = (a_1, a_2)$ in the
plane. From the theorem of Pythagoras, we find that the length of A is given by the
formula

$$\text{length of } A = \sqrt{a_1^2 + a_2^2} .$$

A corresponding picture in 3-space is shown in Figure 12.8. Applying the theorem of
Pythagoras twice, we find that the length of a geometric vector A in 3-space is given by

$$\text{length of } A = \sqrt{a_1^2 + a_2^2 + a_3^2} .$$

Note that in either case the length of A is given by $(A \cdot A)^{1/2}$, the square root of the dot
product of A with itself. This formula suggests a way to introduce the concept of length
in n-space.

DEFINITION. If A is a vector in $V_n$, its length or norm is denoted by ||A|| and is defined by
the equation

$$\|A\| = (A \cdot A)^{1/2} .$$


The fundamental properties of the dot product lead to corresponding properties of norms. 


THEOREM 12.4. If A is a vector in $V_n$ and if c is a scalar, we have the following properties:

(a) ||A|| > 0 if A ≠ O (positivity),

(b) ||A|| = 0 if A = O,

(c) ||cA|| = |c| ||A|| (homogeneity).
Proof. Properties (a) and (b) follow at once from properties (d) and (e) of Theorem
12.2. To prove (c), we use the homogeneity property of dot products to obtain

$$\|cA\| = (cA \cdot cA)^{1/2} = (c^2 A \cdot A)^{1/2} = (c^2)^{1/2} (A \cdot A)^{1/2} = |c| \, \|A\| .$$

The Cauchy-Schwarz inequality can also be expressed in terms of norms. It states that

(12.3) $$(A \cdot B)^2 \le \|A\|^2 \|B\|^2 .$$

Taking the positive square root of each member, we can also write the Cauchy-Schwarz
inequality in the equivalent form

(12.4) $$|A \cdot B| \le \|A\| \, \|B\| .$$

Now we shall use the Cauchy-Schwarz inequality to deduce the triangle inequality. 

THEOREM 12.5. TRIANGLE INEQUALITY. If $A$ and $B$ are vectors in $V_n$, we have

$$\|A + B\| \le \|A\| + \|B\|.$$

Moreover, the equality sign holds if and only if $A = O$, or $B = O$, or $B = cA$ for some
$c > 0$.

Proof. To avoid square roots, we write the triangle inequality in the equivalent form

(12.5)    $\|A + B\|^2 \le (\|A\| + \|B\|)^2.$

The left member of (12.5) is

$$\|A + B\|^2 = (A + B) \cdot (A + B) = A \cdot A + 2A \cdot B + B \cdot B = \|A\|^2 + 2A \cdot B + \|B\|^2,$$

whereas the right member is

$$(\|A\| + \|B\|)^2 = \|A\|^2 + 2 \|A\| \, \|B\| + \|B\|^2.$$

Comparing these two formulas, we see that (12.5) holds if and only if we have

(12.6)    $A \cdot B \le \|A\| \, \|B\|.$

But $A \cdot B \le |A \cdot B|$ so (12.6) follows from the Cauchy-Schwarz inequality, as expressed in
(12.4). This proves that the triangle inequality is a consequence of the Cauchy-Schwarz
inequality.

The converse statement is also true. That is, if the triangle inequality holds then (12.6)
also holds for $A$ and for $-A$, from which we obtain (12.3). If equality holds in (12.5), then
$A \cdot B = \|A\| \, \|B\|$, so equality also holds in (12.4) and one of the vectors is a scalar multiple
of the other. If neither $A$ nor $B$ is $O$, we may therefore write $B = cA$ for some scalar $c$. Hence
$A \cdot B = c \|A\|^2$ and $\|A\| \, \|B\| = |c| \, \|A\|^2$. Since $A \ne O$ this implies $c = |c| \ge 0$, and
since $B \ne O$ we have $B = cA$ with $c > 0$.

The triangle inequality is illustrated geometrically in Figure 12.9. It states that the
length of one side of a triangle does not exceed the sum of the lengths of the other two
sides.
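
The inequality and its equality case can be observed on concrete vectors. The short Python sketch below is an illustration under stated assumptions: the helper names `norm`, `add`, and `triangle_slack` and the sample vectors are introduced here, and floating-point square roots are used, so values are compared only up to rounding.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length of a, defined as the square root of a . a."""
    return math.sqrt(dot(a, a))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def triangle_slack(a, b):
    """Return ||a|| + ||b|| - ||a + b||, which Theorem 12.5 says is >= 0."""
    return norm(a) + norm(b) - norm(add(a, b))


print(triangle_slack((2, -1, 5), (-1, -2, 3)))   # strictly positive
print(triangle_slack((3, 4), (6, 8)))            # B = 2A: the equality case
print(triangle_slack((3, 4), (-3, -4)))          # B = -A: strict inequality
```

The third call shows why the theorem requires $c > 0$: a negative multiple of $A$ shortens the sum instead of lengthening it.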


12.7 Orthogonality of vectors 

In the course of the proof of the triangle inequality (Theorem 12.5), we obtained the
formula

(12.7)    $\|A + B\|^2 = \|A\|^2 + \|B\|^2 + 2A \cdot B,$

which is valid for any two vectors $A$ and $B$ in $V_n$. Figure 12.10 shows two perpendicular
geometric vectors in the plane. They determine a right triangle whose legs have lengths
$\|A\|$ and $\|B\|$ and whose hypotenuse has length $\|A + B\|$. The theorem of Pythagoras
states that

$$\|A + B\|^2 = \|A\|^2 + \|B\|^2.$$

Comparing this with (12.7), we see that $A \cdot B = 0$. In other words, the dot product of two
perpendicular vectors in the plane is zero. This property motivates the definition of
perpendicularity of vectors in $V_n$.

Figure 12.9 Geometric meaning of the triangle inequality: $\|A + B\| \le \|A\| + \|B\|$.

Figure 12.10 Two perpendicular vectors satisfy the Pythagorean identity: $\|A + B\|^2 = \|A\|^2 + \|B\|^2$.


DEFINITION. Two vectors $A$ and $B$ in $V_n$ are called perpendicular or orthogonal if $A \cdot B = 0$.


Equation (12.7) shows that two vectors $A$ and $B$ in $V_n$ are orthogonal if and only if
$\|A + B\|^2 = \|A\|^2 + \|B\|^2$. This is called the Pythagorean identity in $V_n$.
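
As a small check of this equivalence, the following Python sketch (an illustration only; the names `pythagorean_defect` and `is_orthogonal` and the sample vectors are assumptions made here) computes $\|A + B\|^2 - \|A\|^2 - \|B\|^2$ with integers and compares it with $2A \cdot B$, as formula (12.7) predicts.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def pythagorean_defect(a, b):
    """Return ||a + b||^2 - ||a||^2 - ||b||^2, which equals 2 a.b by (12.7)."""
    s = add(a, b)
    return dot(s, s) - dot(a, a) - dot(b, b)


def is_orthogonal(a, b):
    return dot(a, b) == 0


A, B = (4, 1, -3), (1, 2, 2)     # illustrative integer vectors
print(is_orthogonal(A, B), pythagorean_defect(A, B))

P, Q = (1, 2), (3, 4)
print(is_orthogonal(P, Q), pythagorean_defect(P, Q), 2 * dot(P, Q))
```

The defect vanishes exactly when the dot product does, which is the content of the Pythagorean identity.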

12.8 Exercises

1. Let $A = (1, 2, 3, 4)$, $B = (-1, 2, -3, 0)$, and $C = (0, 1, 0, 1)$ be three vectors in $V_4$. Compute
each of the following dot products:

(a) $A \cdot B$; (b) $B \cdot C$; (c) $A \cdot C$; (d) $A \cdot (B + C)$; (e) $(A - B) \cdot C$.

2. Given three vectors $A = (2, 4, -7)$, $B = (2, 6, 3)$, and $C = (3, 4, -5)$. In each of the following
there is only one way to insert parentheses to obtain a meaningful expression. Insert
parentheses and perform the indicated operations.

(a) $A \cdot BC$; (b) $A \cdot B + C$; (c) $A + B \cdot C$; (d) $AB \cdot C$; (e) $A / B \cdot C$.

3. Prove or disprove the following statement about vectors in $V_n$: If $A \cdot B = A \cdot C$ and $A \ne O$,
then $B = C$.

4. Prove or disprove the following statement about vectors in $V_n$: If $A \cdot B = 0$ for every $B$, then
$A = O$.

5. If $A = (2, 1, -1)$ and $B = (1, -1, 2)$, find a nonzero vector $C$ in $V_3$ such that $A \cdot C = B \cdot C = 0$.

6. If $A = (1, -2, 3)$ and $B = (3, 1, 2)$, find scalars $x$ and $y$ such that $C = xA + yB$ is a nonzero
vector with $C \cdot B = 0$.

7. If $A = (2, -1, 2)$ and $B = (1, 2, -2)$, find two vectors $C$ and $D$ in $V_3$ satisfying all the
following conditions: $A = C + D$, $B \cdot D = 0$, $C$ parallel to $B$.

8. If $A = (1, 2, 3, 4, 5)$ and $B = (1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5})$, find two vectors $C$ and $D$ in $V_5$ satisfying all the
following conditions: $B = C + 2D$, $D \cdot A = 0$, $C$ parallel to $A$.

9. Let $A = (2, -1, 5)$, $B = (-1, -2, 3)$, and $C = (1, -1, 1)$ be three vectors in $V_3$. Calculate
the norm of each of the following vectors:

(a) $A + B$; (b) $A - B$; (c) $A + B - C$; (d) $A - B + C$.

10. In each case, find a vector $B$ in $V_2$ such that $B \cdot A = 0$ and $\|B\| = \|A\|$ if:

(a) $A = (1, 1)$; (b) $A = (1, -1)$; (c) $A = (2, -3)$; (d) $A = (a, b)$.

11. Let $A = (1, -2, 3)$ and $B = (3, 1, 2)$ be two vectors in $V_3$. In each case, find a vector $C$ of
length 1 parallel to:

(a) $A + B$; (b) $A - B$; (c) $A + 2B$; (d) $A - 2B$; (e) $2A - B$.

12. Let $A = (4, 1, -3)$, $B = (1, 2, 2)$, $C = (1, 2, -2)$, $D = (2, 1, 2)$, and $E = (2, -2, -1)$ be
vectors in $V_3$. Determine all orthogonal pairs.

13. Find all vectors in $V_2$ that are orthogonal to $A$ and have the same length as $A$ if:

(a) $A = (1, 2)$; (b) $A = (1, -2)$; (c) $A = (2, -1)$; (d) $A = (-2, 1)$.

14. If $A = (2, -1, 1)$ and $B = (3, -4, -4)$, find a point $C$ in 3-space such that $A$, $B$, and $C$ are
the vertices of a right triangle.

15. If $A = (1, -1, 2)$ and $B = (2, 1, -1)$, find a nonzero vector $C$ in $V_3$ orthogonal to $A$ and $B$.

16. Let $A = (1, 2)$ and $B = (3, 4)$ be two vectors in $V_2$. Find vectors $P$ and $Q$ in $V_2$ such that
$A = P + Q$, $P$ is parallel to $B$, and $Q$ is orthogonal to $B$.

17. Solve Exercise 16 if the vectors are in $V_4$, with $A = (1, 2, 3, 4)$ and $B = (1, 1, 1, 1)$.

18. Given vectors $A = (2, -1, 1)$, $B = (1, 2, -1)$, and $C = (1, 1, -2)$ in $V_3$. Find every vector
$D$ of the form $xB + yC$ which is orthogonal to $A$ and has length 1.

19. Prove that for two vectors $A$ and $B$ in $V_n$ we have the identity

$$\|A + B\|^2 - \|A - B\|^2 = 4A \cdot B,$$

and hence $A \cdot B = 0$ if and only if $\|A + B\| = \|A - B\|$. When this is interpreted
geometrically in $V_2$, it states that the diagonals of a parallelogram are of equal length if and only if
the parallelogram is a rectangle.

20. Prove that for any two vectors $A$ and $B$ in $V_n$ we have

$$\|A + B\|^2 + \|A - B\|^2 = 2 \|A\|^2 + 2 \|B\|^2.$$

What geometric theorem about the sides and diagonals of a parallelogram can you deduce
from this identity?

21. The following theorem in geometry suggests a vector identity involving three vectors $A$, $B$,
and $C$. Guess the identity and prove that it holds for vectors in $V_n$. This provides a proof of the
theorem by vector methods.

"The sum of the squares of the sides of any quadrilateral exceeds the sum of the squares of
the diagonals by four times the square of the length of the line segment which connects the
midpoints of the diagonals."

22. A vector $A$ in $V_n$ has length 6. A vector $B$ in $V_n$ has the property that for every pair of scalars
$x$ and $y$ the vectors $xA + yB$ and $4yA - 9xB$ are orthogonal. Compute the length of $B$ and
the length of $2A + 3B$.

23. Given two vectors $A = (1, 2, 3, 4, 5)$ and $B = (1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5})$ in $V_5$. Find two vectors $C$ and $D$
satisfying the following three conditions: $C$ is parallel to $A$, $D$ is orthogonal to $A$, and
$B = C + D$.

24. Given two nonperpendicular vectors $A$ and $B$ in $V_n$, prove that there exist vectors $C$ and $D$
in $V_n$ satisfying the three conditions in Exercise 23 and express $C$ and $D$ in terms of $A$ and $B$.

25. Prove or disprove each of the following statements concerning vectors in $V_n$:

(a) If $A$ is orthogonal to $B$, then $\|A + xB\| \ge \|A\|$ for all real $x$.

(b) If $\|A + xB\| \ge \|A\|$ for all real $x$, then $A$ is orthogonal to $B$.


12.9 Projections. Angle between vectors in n-space 

The dot product of two vectors in $V_2$ has an interesting geometric interpretation. Figure
12.11(a) shows two nonzero geometric vectors $A$ and $B$ making an angle $\theta$ with each other.
In this example, we have $0 < \theta < \frac{1}{2}\pi$. Figure 12.11(b) shows the same vector $A$ and two
perpendicular vectors whose sum is $A$. One of these, $tB$, is a scalar multiple of $B$ which we
call the projection of $A$ along $B$. In this example, $t$ is positive since $0 < \theta < \frac{1}{2}\pi$.




Figure 12.11 The vector tB is the projection of A along B. 

We can use dot products to express $t$ in terms of $A$ and $B$. First we write $tB + C = A$
and then take the dot product of each member with $B$ to obtain

$$tB \cdot B + C \cdot B = A \cdot B.$$

But $C \cdot B = 0$, because $C$ was drawn perpendicular to $B$. Therefore $tB \cdot B = A \cdot B$, so
we have

(12.8)    $t = \dfrac{A \cdot B}{B \cdot B} = \dfrac{A \cdot B}{\|B\|^2}.$

On the other hand, the scalar $t$ bears a simple relation to the angle $\theta$. From Figure 12.11(b),
we see that

$$\cos \theta = \frac{\|tB\|}{\|A\|} = \frac{t \, \|B\|}{\|A\|}.$$

Using (12.8) in this formula, we find that

(12.9)    $\cos \theta = \dfrac{A \cdot B}{\|A\| \, \|B\|}$

or

$$A \cdot B = \|A\| \, \|B\| \cos \theta.$$


In other words, the dot product of two nonzero vectors $A$ and $B$ in $V_2$ is equal to the
product of three numbers: the length of $A$, the length of $B$, and the cosine of the angle between
$A$ and $B$.

Equation (12.9) suggests a way to define the concept of angle in $V_n$. The Cauchy-Schwarz
inequality, as expressed in (12.4), shows that the quotient on the right of (12.9) has absolute
value $\le 1$ for any two nonzero vectors in $V_n$. In other words, we have

$$-1 \le \frac{A \cdot B}{\|A\| \, \|B\|} \le 1.$$

Therefore, there is exactly one real $\theta$ in the interval $0 \le \theta \le \pi$ such that (12.9) holds. We
define the angle between $A$ and $B$ to be this $\theta$. The foregoing discussion is summarized in
the following definition.


DEFINITION. Let $A$ and $B$ be two vectors in $V_n$, with $B \ne O$. The vector $tB$, where

$$t = \frac{A \cdot B}{B \cdot B},$$

is called the projection of $A$ along $B$. If both $A$ and $B$ are nonzero, the angle $\theta$ between $A$
and $B$ is defined by the equation

$$\theta = \arccos \frac{A \cdot B}{\|A\| \, \|B\|}.$$

Note: The arc cosine function restricts $\theta$ to the interval $0 \le \theta \le \pi$. Note also that
$\theta = \frac{1}{2}\pi$ when $A \cdot B = 0$.
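
The two formulas in this definition translate directly into a short computation. The Python sketch below is an illustration under stated assumptions: the function names `projection` and `angle` and the sample vectors are introduced here, integer components are assumed so that $t$ can be held as an exact fraction, and the angle is computed in floating point.

```python
import math
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def projection(a, b):
    """Projection tB of a along b, with t = (a . b) / (b . b); b must be nonzero."""
    t = Fraction(dot(a, b), dot(b, b))
    return tuple(t * x for x in b)


def angle(a, b):
    """Angle between nonzero vectors a and b, in the interval [0, pi]."""
    cos_theta = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    # Guard against rounding pushing the quotient slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


A, B = (1, 2, 3), (1, 2, 2)              # illustrative integer vectors
P = projection(A, B)
C = tuple(x - y for x, y in zip(A, P))   # the remainder A - tB
print(P, dot(C, B))                      # the remainder is orthogonal to B
print(angle((1, 0), (0, 1)), angle((1, 0), (1, 1)))
```

The remainder $A - tB$ plays the role of the vector $C$ in Figure 12.11(b), so its dot product with $B$ should be zero.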


12.10 The unit coordinate vectors 

In Chapter 9 we learned that every complex number $(a, b)$ can be expressed in the form
$a + bi$, where $i$ denotes the complex number $(0, 1)$. Similarly, every vector $(a, b)$ in $V_2$
can be expressed in the form

$$(a, b) = a(1, 0) + b(0, 1).$$

The two vectors $(1, 0)$ and $(0, 1)$ which multiply the components $a$ and $b$ are called unit
coordinate vectors. We now introduce the corresponding concept in $V_n$.

DEFINITION. In $V_n$, the $n$ vectors $E_1 = (1, 0, \ldots, 0)$, $E_2 = (0, 1, 0, \ldots, 0)$, ..., $E_n = (0, 0, \ldots, 0, 1)$
are called the unit coordinate vectors. It is understood that the kth component
of $E_k$ is 1 and all other components are 0.

The name "unit vector" comes from the fact that each vector $E_k$ has length 1. Note that
these vectors are mutually orthogonal, that is, the dot product of any two distinct vectors
is zero,

$$E_k \cdot E_j = 0 \quad \text{if } k \ne j.$$

THEOREM 12.6. Every vector $X = (x_1, \ldots, x_n)$ in $V_n$ can be expressed in the form

$$X = x_1 E_1 + \cdots + x_n E_n = \sum_{k=1}^{n} x_k E_k.$$

Moreover, this representation is unique. That is, if

$$X = \sum_{k=1}^{n} x_k E_k \quad \text{and} \quad X = \sum_{k=1}^{n} y_k E_k,$$

then $x_k = y_k$ for each $k = 1, 2, \ldots, n$.

Proof. The first statement follows immediately from the definition of addition and 
multiplication by scalars. The uniqueness property follows from the definition of vector 
equality. 

A sum of the type $\sum c_i A_i$ is called a linear combination of the vectors $A_1, \ldots, A_k$.

Theorem 12.6 tells us that every vector in $V_n$ can be expressed as a linear combination of
the unit coordinate vectors. We describe this by saying that the unit coordinate vectors
$E_1, \ldots, E_n$ span the space $V_n$. We also say they span $V_n$ uniquely because each
representation of a vector as a linear combination of $E_1, \ldots, E_n$ is unique. Some collections of
vectors other than $E_1, \ldots, E_n$ also span $V_n$ uniquely, and in Section 12.12 we turn to the
study of such collections.

In $V_2$ the unit coordinate vectors $E_1$ and $E_2$ are often denoted, respectively, by the
symbols $i$ and $j$ in bold-face italic type. In $V_3$ the symbols $i$, $j$, and $k$ are also used in place
of $E_1$, $E_2$, $E_3$. Sometimes a bar or arrow is placed over the symbol, for example, $\bar{i}$ or $\vec{i}$.
The geometric meaning of Theorem 12.6 is illustrated in Figure 12.12 for $n = 3$.

Figure 12.12 A vector $A$ in $V_3$ expressed as a linear combination of $i$, $j$, $k$.

When vectors are expressed as linear combinations of the unit coordinate vectors,
algebraic manipulations involving vectors can be performed by treating the sums $\sum x_k E_k$
according to the usual rules of algebra. The various components can be recognized at any
stage in the calculation by collecting the coefficients of the unit coordinate vectors. For
example, to add two vectors, say $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$, we write

$$A = \sum_{k=1}^{n} a_k E_k, \qquad B = \sum_{k=1}^{n} b_k E_k,$$

and apply the linearity property of finite sums to obtain

$$A + B = \sum_{k=1}^{n} a_k E_k + \sum_{k=1}^{n} b_k E_k = \sum_{k=1}^{n} (a_k + b_k) E_k.$$

The coefficient of $E_k$ on the right is the kth component of the sum $A + B$.
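
The bookkeeping described here can be mirrored in a few lines of Python. This is a minimal sketch, not part of the original text; the function names `unit_vector` and `linear_combination` and the sample vectors are assumptions made for illustration.

```python
def unit_vector(k, n):
    """E_k in V_n: the kth component (1-based) is 1 and all others are 0."""
    return tuple(1 if i == k else 0 for i in range(1, n + 1))


def linear_combination(coeffs, vectors):
    """Return the sum of coeffs[i] * vectors[i], computed componentwise."""
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n))


n = 3
E = [unit_vector(k, n) for k in range(1, n + 1)]

A = (7, -2, 5)        # illustrative vectors
B = (1, 4, -3)
print(linear_combination(A, E) == A)   # A is recovered from its own components

# Adding the coefficients of each E_k gives the components of A + B.
coeffs = [a + b for a, b in zip(A, B)]
print(linear_combination(coeffs, E))
```

Using the components of a vector as coefficients reproduces the vector, which is the first statement of Theorem 12.6.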

12.11 Exercises 

1. Determine the projection of $A$ along $B$ if $A = (1, 2, 3)$ and $B = (1, 2, 2)$.

2. Determine the projection of $A$ along $B$ if $A = (4, 3, 2, 1)$ and $B = (1, 1, 1, 1)$.

3. (a) Let $A = (6, 3, -2)$, and let $a$, $b$, $c$ denote the angles between $A$ and the unit coordinate
vectors $i$, $j$, $k$, respectively. Compute $\cos a$, $\cos b$, and $\cos c$. These are called the direction
cosines of $A$.

(b) Find all vectors in $V_3$ of length 1 parallel to $A$.

4. Prove that the angle between the two vectors $A = (1, 2, 1)$ and $B = (2, 1, -1)$ is twice that
between $C = (1, 4, 1)$ and $D = (2, 5, 5)$.

5. Use vector methods to determine the cosines of the angles of the triangle in 3-space whose
vertices are at the points $(2, -1, 1)$, $(1, -3, -5)$, and $(3, -4, -4)$.

6. Three vectors $A$, $B$, $C$ in $V_3$ satisfy all the following properties:

$$\|A\| = \|C\| = 5, \quad \|B\| = 1, \quad \|A - B + C\| = \|A + B + C\|.$$

If the angle between $A$ and $B$ is $\pi/8$, find the angle between $B$ and $C$.

7. Given three nonzero vectors $A$, $B$, $C$ in $V_n$. Assume that the angle between $A$ and $C$ is equal to
the angle between $B$ and $C$. Prove that $C$ is orthogonal to the vector $\|B\| A - \|A\| B$.

8. Let $\theta$ denote the angle between the following two vectors in $V_n$: $A = (1, 1, \ldots, 1)$ and
$B = (1, 2, \ldots, n)$. Find the limiting value of $\theta$ as $n \to \infty$.

9. Solve Exercise 8 if $A = (2, 4, 6, \ldots, 2n)$ and $B = (1, 3, 5, \ldots, 2n - 1)$.

10. Given vectors $A = (\cos \theta, -\sin \theta)$ and $B = (\sin \theta, \cos \theta)$ in $V_2$.

(a) Prove that $A$ and $B$ are orthogonal vectors of length 1. Make a sketch showing $A$ and $B$
when $\theta = \pi/6$.

(b) Find all vectors $(x, y)$ in $V_2$ such that $(x, y) = xA + yB$. Be sure to consider all possible
values of $\theta$.

11. Use vector methods to prove that the diagonals of a rhombus are perpendicular.

12. By forming the dot product of the two vectors $(\cos a, \sin a)$ and $(\cos b, \sin b)$, deduce the
trigonometric identity $\cos (a - b) = \cos a \cos b + \sin a \sin b$.

13. If $\theta$ is the angle between two nonzero vectors $A$ and $B$ in $V_n$, prove that

$$\|A - B\|^2 = \|A\|^2 + \|B\|^2 - 2 \|A\| \, \|B\| \cos \theta.$$

When interpreted geometrically in $V_2$, this is the law of cosines of trigonometry.

14. Suppose that instead of defining the dot product of two vectors $A = (a_1, \ldots, a_n)$ and
$B = (b_1, \ldots, b_n)$ by the formula $A \cdot B = \sum_{k=1}^{n} a_k b_k$, we used the following definition:

$$A \cdot B = \sum_{k=1}^{n} |a_k b_k|.$$

Which of the properties of Theorem 12.2 are valid with this definition? Is the Cauchy-Schwarz
inequality valid with this definition?

15. Suppose that in $V_2$ we define the dot product of two vectors $A = (a_1, a_2)$ and $B = (b_1, b_2)$ by
the formula

$$A \cdot B = 2a_1 b_1 + a_2 b_2 + a_1 b_2 + a_2 b_1.$$

Prove that all the properties of Theorem 12.2 are valid with this definition of dot product. Is
the Cauchy-Schwarz inequality still valid?

16. Solve Exercise 15 if the dot product of two vectors $A = (a_1, a_2, a_3)$ and $B = (b_1, b_2, b_3)$ in $V_3$
is defined by the formula $A \cdot B = 2a_1 b_1 + a_2 b_2 + a_3 b_3 + a_1 b_3 + a_3 b_1$.

17. Suppose that instead of defining the norm of a vector $A = (a_1, \ldots, a_n)$ by the formula
$(A \cdot A)^{1/2}$, we used the following definition:

$$\|A\| = \sum_{k=1}^{n} |a_k|.$$

(a) Prove that this definition of norm satisfies all the properties in Theorems 12.4 and 12.5.

(b) Use this definition in $V_2$ and describe on a figure the set of all points $(x, y)$ of norm 1.

(c) Which of the properties of Theorems 12.4 and 12.5 would hold if we used the definition

$$\|A\| = \left| \sum_{k=1}^{n} a_k \right| ?$$


18. Suppose that the norm of a vector $A = (a_1, \ldots, a_n)$ were defined by the formula

$$\|A\| = \max_{1 \le k \le n} |a_k|,$$

where the symbol on the right means the maximum of the $n$ numbers $|a_1|, |a_2|, \ldots, |a_n|$.

(a) Which of the properties of Theorems 12.4 and 12.5 are valid with this definition?

(b) Use this definition of norm in $V_2$ and describe on a figure the set of all points $(x, y)$ of
norm 1.

19. If $A = (a_1, \ldots, a_n)$ is a vector in $V_n$, define two norms as follows:

$$\|A\|_1 = \sum_{k=1}^{n} |a_k| \quad \text{and} \quad \|A\|_2 = \max_{1 \le k \le n} |a_k|.$$

Prove that $\|A\|_2 \le \|A\| \le \|A\|_1$. Interpret this inequality geometrically in the plane.

20. If $A$ and $B$ are two points in $n$-space, the distance from $A$ to $B$ is denoted by $d(A, B)$ and is
defined by the equation $d(A, B) = \|A - B\|$. Prove that distance has the following
properties:

(a) $d(A, B) = d(B, A)$. (b) $d(A, B) = 0$ if and only if $A = B$.

(c) $d(A, B) \le d(A, C) + d(C, B)$.

12.12 The linear span of a finite set of vectors 

Let $S = \{A_1, \ldots, A_k\}$ be a nonempty set consisting of $k$ vectors in $V_n$, where $k$, the
number of vectors, may be less than, equal to, or greater than the dimension of the space.
If a vector $X$ in $V_n$ can be represented as a linear combination of $A_1, \ldots, A_k$, say

$$X = \sum_{i=1}^{k} c_i A_i,$$

then the set $S$ is said to span the vector $X$.

DEFINITION. The set of all vectors spanned by $S$ is called the linear span of $S$ and is denoted
by $L(S)$.

In other words, the linear span of $S$ is simply the set of all possible linear combinations
of vectors in $S$. Note that linear combinations of vectors in $L(S)$ are again in $L(S)$. We
say that $S$ spans the whole space $V_n$ if $L(S) = V_n$.

EXAMPLE 1. Let $S = \{A_1\}$. Then $L(S)$ consists of all scalar multiples of $A_1$.

EXAMPLE 2. Every set $S = \{A_1, \ldots, A_k\}$ spans the zero vector since $O = 0A_1 + \cdots + 0A_k$.
This representation, in which all the coefficients $c_1, \ldots, c_k$ are zero, is called the trivial
representation of the zero vector. However, there may be nontrivial linear combinations
that represent $O$. For example, suppose one of the vectors in $S$ is a scalar multiple of
another, say $A_2 = 2A_1$. Then we have many nontrivial representations of $O$, for example

$$O = 2tA_1 - tA_2 + 0A_3 + \cdots + 0A_k,$$

where $t$ is any nonzero scalar.
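
A concrete instance makes the nontrivial representations visible. In the Python sketch below, the three vectors are hypothetical and chosen only for illustration, with the second equal to twice the first; the helper `linear_combination` is likewise an assumption introduced here.

```python
def linear_combination(coeffs, vectors):
    """Return the sum of coeffs[i] * vectors[i], computed componentwise."""
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(n))


A1 = (1, -2, 3)           # illustrative vectors in V_3
A2 = (2, -4, 6)           # A2 = 2 * A1, so one vector is a multiple of another
A3 = (0, 1, 1)
S = [A1, A2, A3]

for t in (1, -3, 5):
    # Coefficients (2t, -t, 0) are not all zero, yet the combination is O.
    print(t, linear_combination((2 * t, -t, 0), S))
```

Each nonzero choice of $t$ gives a different set of coefficients representing the same zero vector.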

We are especially interested in sets $S$ that span vectors in exactly one way.

DEFINITION. A set $S = \{A_1, \ldots, A_k\}$ of vectors in $V_n$ is said to span $X$ uniquely if $S$ spans
$X$ and if

(12.10)    $X = \sum_{i=1}^{k} c_i A_i$ and $X = \sum_{i=1}^{k} d_i A_i$ implies $c_i = d_i$ for all $i$.

In the two sums appearing in (12.10), it is understood that the vectors $A_1, \ldots, A_k$ are
written in the same order. It is also understood that the implication (12.10) is to hold for a
fixed but arbitrary ordering of the vectors $A_1, \ldots, A_k$.

THEOREM 12.7. A set $S$ spans every vector in $L(S)$ uniquely if and only if $S$ spans the
zero vector uniquely.

Proof. If $S$ spans every vector in $L(S)$ uniquely, then it certainly spans $O$ uniquely. To
prove the converse, assume $S$ spans $O$ uniquely and choose any vector $X$ in $L(S)$. Suppose
$S$ spans $X$ in two ways, say

$$X = \sum_{i=1}^{k} c_i A_i \quad \text{and} \quad X = \sum_{i=1}^{k} d_i A_i.$$

By subtraction, we find that $O = \sum_{i=1}^{k} (c_i - d_i) A_i$. But since $S$ spans $O$ uniquely, we must
have $c_i - d_i = 0$ for all $i$, so $S$ spans $X$ uniquely.

12.13 Linear independence 

Theorem 12.7 demonstrates the importance of sets that span the zero vector uniquely. 
Such sets are distinguished with a special name. 

DEFINITION. A set $S = \{A_1, \ldots, A_k\}$ which spans the zero vector uniquely is said to be
a linearly independent set of vectors. Otherwise, $S$ is called linearly dependent.

In other words, independence means that $S$ spans $O$ with only the trivial representation:

$$\sum_{i=1}^{k} c_i A_i = O \quad \text{implies all } c_i = 0.$$

Dependence means that $S$ spans $O$ in some nontrivial way. That is, for some choice of
scalars $c_1, \ldots, c_k$, we have

$$\sum_{i=1}^{k} c_i A_i = O \quad \text{but not all } c_i \text{ are zero.}$$

Although dependence and independence are properties of sets of vectors, it is common
practice to also apply these terms to the vectors themselves. For example, the vectors in
a linearly independent set are often called linearly independent vectors. We also agree to
call the empty set linearly independent.

The following examples may serve to give further insight into the meaning of dependence 
and independence. 

EXAMPLE 1. If a subset $T$ of a set $S$ is dependent, then $S$ itself is dependent, because
if $T$ spans $O$ nontrivially, then so does $S$. This is logically equivalent to the statement that
every subset of an independent set is independent.

EXAMPLE 2. The $n$ unit coordinate vectors $E_1, \ldots, E_n$ in $V_n$ span $O$ uniquely so they
are linearly independent.

EXAMPLE 3. Any set containing the zero vector is dependent. For example, if $A_1 = O$,
we have the nontrivial representation $O = A_1 + 0A_2 + \cdots + 0A_k$.

EXAMPLE 4. The set $S = \{i, j, i + j\}$ of vectors in $V_2$ is linearly dependent because we
have the nontrivial representation of the zero vector

$$O = i + j + (-1)(i + j).$$

In this example the subset $T = \{i, j\}$ is linearly independent. The third vector, $i + j$, is
in the linear span of $T$. The next theorem shows that if we adjoin to $i$ and $j$ any vector in the
linear span of $T$, we get a dependent set.
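
For concrete vectors, dependence can be tested mechanically. The Python sketch below uses Gaussian elimination with exact fractions, a technique that is not developed in this section and is brought in here only as a computational aid; the function names `rank` and `is_independent` are assumptions. A set of $k$ vectors admits only the trivial representation of $O$ exactly when elimination produces $k$ pivot rows.

```python
from fractions import Fraction


def rank(vectors):
    """Number of pivot rows found by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # index of the next pivot row
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(vectors):
    """True when the only representation of O by the vectors is the trivial one."""
    return rank(vectors) == len(vectors)


i, j = (1, 0), (0, 1)
print(is_independent([i, j]))               # the subset T of Example 4
print(is_independent([i, j, (1, 1)]))       # the set S of Example 4
print(is_independent([(0, 0), (1, 0)]))     # contains the zero vector (Example 3)
```

With this convention the empty list is reported as independent, in agreement with the convention adopted above for the empty set.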

THEOREM 12.8. Let $S = \{A_1, \ldots, A_k\}$ be a linearly independent set of $k$ vectors in $V_n$,
and let $L(S)$ be the linear span of $S$. Then, every set of $k + 1$ vectors in $L(S)$ is linearly
dependent.

Proof. The proof is by induction on $k$, the number of vectors in $S$. First suppose $k = 1$.
Then, by hypothesis, $S$ consists of one vector, say $A_1$, where $A_1 \ne O$ since $S$ is independent.
Now take any two distinct vectors $B_1$ and $B_2$ in $L(S)$. Then each is a scalar multiple of $A_1$,
say $B_1 = c_1 A_1$ and $B_2 = c_2 A_1$, where $c_1$ and $c_2$ are not both zero. Multiplying $B_1$ by $c_2$ and
$B_2$ by $c_1$ and subtracting, we find that

$$c_2 B_1 - c_1 B_2 = O.$$

This is a nontrivial representation of $O$ so $B_1$ and $B_2$ are dependent. This proves the
theorem when $k = 1$.

Now we assume that the theorem is true for $k - 1$ and prove that it is also true for $k$.
Take any set of $k + 1$ vectors in $L(S)$, say $T = \{B_1, B_2, \ldots, B_{k+1}\}$. We wish to prove that
$T$ is linearly dependent. Since each $B_i$ is in $L(S)$, we may write

(12.11)    $B_i = \sum_{j=1}^{k} a_{ij} A_j$

for each $i = 1, 2, \ldots, k + 1$. We examine all the scalars $a_{i1}$ that multiply $A_1$ and split
the proof into two cases according to whether all these scalars are 0 or not.

CASE 1. $a_{i1} = 0$ for every $i = 1, 2, \ldots, k + 1$. In this case the sum in (12.11) does not
involve $A_1$ so each $B_i$ in $T$ is in the linear span of the set $S' = \{A_2, \ldots, A_k\}$. But $S'$ is
linearly independent and consists of $k - 1$ vectors. By the induction hypothesis, the
theorem is true for $k - 1$, so any $k$ of the vectors in $T$ are dependent, and hence the set $T$
is dependent. This proves the theorem in Case 1.

CASE 2. Not all the scalars $a_{i1}$ are zero. Let us assume that $a_{11} \ne 0$. (If necessary, we
can renumber the $B$'s to achieve this.) Taking $i = 1$ in Equation (12.11) and multiplying
both members by $c_i$, where $c_i = a_{i1} / a_{11}$, we get

$$c_i B_1 = a_{i1} A_1 + \sum_{j=2}^{k} c_i a_{1j} A_j.$$

From this we subtract Equation (12.11) to get

$$c_i B_1 - B_i = \sum_{j=2}^{k} (c_i a_{1j} - a_{ij}) A_j,$$

for $i = 2, \ldots, k + 1$. This equation expresses each of the $k$ vectors $c_i B_1 - B_i$ as a linear
combination of $k - 1$ linearly independent vectors $A_2, \ldots, A_k$. By the induction
hypothesis, the $k$ vectors $c_i B_1 - B_i$ must be dependent. Hence, for some choice of scalars
$t_2, \ldots, t_{k+1}$, not all zero, we have

$$\sum_{i=2}^{k+1} t_i (c_i B_1 - B_i) = O,$$

from which we find

$$\left( \sum_{i=2}^{k+1} t_i c_i \right) B_1 - \sum_{i=2}^{k+1} t_i B_i = O.$$

But this is a nontrivial linear combination of $B_1, \ldots, B_{k+1}$ which represents the zero vector,
so the vectors $B_1, \ldots, B_{k+1}$ must be dependent. This completes the proof.

We show next that the concept of orthogonality is intimately related to linear
independence.

DEFINITION. A set $S = \{A_1, \ldots, A_k\}$ of vectors in $V_n$ is called an orthogonal set if
$A_i \cdot A_j = 0$ whenever $i \ne j$. In other words, any two distinct vectors in an orthogonal set
are perpendicular.

THEOREM 12.9. Any orthogonal set $S = \{A_1, \ldots, A_k\}$ of nonzero vectors in $V_n$ is linearly
independent. Moreover, if $S$ spans a vector $X$, say

(12.12)    $X = \sum_{i=1}^{k} c_i A_i,$

then the scalar multipliers $c_1, \ldots, c_k$ are given by the formulas

(12.13)    $c_j = \dfrac{X \cdot A_j}{A_j \cdot A_j}$ for $j = 1, 2, \ldots, k$.

Proof. First we prove that $S$ is linearly independent. Assume that $\sum_{i=1}^{k} c_i A_i = O$.
Taking the dot product of each member with $A_1$ and using the fact that $A_1 \cdot A_i = 0$ for
each $i \ne 1$, we find $c_1 (A_1 \cdot A_1) = 0$. But $(A_1 \cdot A_1) \ne 0$ since $A_1 \ne O$, so $c_1 = 0$. Repeating
this argument with $A_1$ replaced by $A_j$, we find that each $c_j = 0$. Therefore $S$ spans $O$
uniquely so $S$ is linearly independent.

Now suppose that $S$ spans $X$ as in Equation (12.12). Taking the dot product of $X$ with
$A_j$ as above, we find that $c_j (A_j \cdot A_j) = X \cdot A_j$ from which we obtain (12.13).

If all the vectors $A_1, \ldots, A_k$ in Theorem 12.9 have norm 1, the formula for the multipliers
simplifies to

$$c_j = X \cdot A_j.$$

An orthogonal set of vectors $\{A_1, \ldots, A_k\}$, each of which has norm 1, is called an
orthonormal set. The unit coordinate vectors $E_1, \ldots, E_n$ are an example of an orthonormal set.
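
Formula (12.13) gives the multipliers without solving any system of equations. The Python sketch below illustrates this on a hypothetical orthogonal set in $V_3$; the set, the vector $X$, and the function name `coefficients` are assumptions chosen for the example, and integer components are assumed so that the quotients can be kept as exact fractions.

```python
from fractions import Fraction


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def coefficients(x, orthogonal_set):
    """Multipliers c_j = (X . A_j) / (A_j . A_j) from formula (12.13)."""
    return [Fraction(dot(x, a), dot(a, a)) for a in orthogonal_set]


S = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]   # illustrative orthogonal set in V_3
X = (3, 5, 4)

c = coefficients(X, S)
rebuilt = tuple(sum(cj * a[i] for cj, a in zip(c, S)) for i in range(3))
print(c, rebuilt == X)
```

The formula applies only when $S$ actually spans $X$; here the three orthogonal nonzero vectors are independent, so by Theorem 12.10(c) they span all of $V_3$.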


12.14 Bases 

It is natural to study sets of vectors that span every vector in $V_n$ uniquely. Such sets are
called bases for $V_n$.

DEFINITION. A set $S = \{A_1, \ldots, A_k\}$ of vectors in $V_n$ is called a basis for $V_n$ if $S$ spans
every vector in $V_n$ uniquely. If, in addition, $S$ is orthogonal then $S$ is called an orthogonal
basis.


Thus, a basis is a linearly independent set which spans the whole space $V_n$. The set of
unit coordinate vectors is an example of a basis. This particular basis is also an orthogonal
basis. Now we prove that every basis contains the same number of elements.

THEOREM 12.10. In a given vector space $V_n$, bases have the following properties:

(a) Every basis contains exactly $n$ vectors.

(b) Any set of linearly independent vectors is a subset of some basis.

(c) Any set of $n$ linearly independent vectors is a basis.

Proof. The unit coordinate vectors $E_1, \ldots, E_n$ form one basis for $V_n$. If we prove that
any two bases contain the same number of vectors we obtain (a).

Let $S$ and $T$ be two bases, where $S$ has $k$ vectors and $T$ has $r$ vectors. If $r > k$, then $T$
contains at least $k + 1$ vectors in $L(S)$, since $L(S) = V_n$. Therefore, because of Theorem
12.8, $T$ must be linearly dependent, contradicting the assumption that $T$ is a basis. This
means we cannot have $r > k$, so we must have $r \le k$. Applying the same argument with
$S$ and $T$ interchanged, we find that $k \le r$. Hence, $k = r$ so part (a) is proved.

To prove (b), let $S = \{A_1, \ldots, A_k\}$ be any linearly independent set of vectors in $V_n$.
If $L(S) = V_n$, then $S$ is a basis. If not, then there is some vector $X$ in $V_n$ which is not in
$L(S)$. Adjoin this vector to $S$ and consider the new set $S' = \{A_1, \ldots, A_k, X\}$. If this set
were dependent, there would be scalars $c_1, \ldots, c_{k+1}$, not all zero, such that

$$\sum_{i=1}^{k} c_i A_i + c_{k+1} X = O.$$

But $c_{k+1} \ne 0$ since $A_1, \ldots, A_k$ are independent. Hence, we could solve this equation for
$X$ and find that $X \in L(S)$, contradicting the fact that $X$ is not in $L(S)$. Therefore, the set
$S'$ is linearly independent but contains $k + 1$ vectors. If $L(S') = V_n$, then $S'$ is a basis
and, since $S$ is a subset of $S'$, part (b) is proved. If $S'$ is not a basis, we may argue with $S'$
as we did with $S$, getting a new set $S''$ which contains $k + 2$ vectors and is linearly
independent. If $S''$ is a basis, then part (b) is proved. If not, we repeat the process. We must
arrive at a basis in a finite number of steps, otherwise we would eventually obtain an
independent set with $n + 1$ vectors, contradicting Theorem 12.8. Therefore part (b) is proved.

Finally, we use (a) and (b) to prove (c). Let $S$ be any linearly independent set consisting
of $n$ vectors. By part (b), $S$ is a subset of some basis, say $B$. But by (a) the basis $B$ has
exactly $n$ elements, so $S = B$.
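
The adjoining process in the proof of part (b) can be imitated on concrete vectors. The Python sketch below is an illustration under stated assumptions, not the method of the text: the proof adjoins an arbitrary vector $X$ outside $L(S)$, whereas the sketch restricts the candidates to the unit coordinate vectors, and it decides membership in the span by Gaussian elimination with exact fractions. The function names `rank` and `extend_to_basis` are introduced here.

```python
from fractions import Fraction


def rank(vectors):
    """Number of pivot rows found by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def extend_to_basis(independent, n):
    """Adjoin unit coordinate vectors of V_n that lie outside the current span.

    The input is assumed to be a linearly independent list of n-tuples.
    """
    basis = list(independent)
    for k in range(n):
        e = tuple(1 if i == k else 0 for i in range(n))
        if rank(basis + [e]) > len(basis):   # e is not in L(basis)
            basis.append(e)
    return basis


print(extend_to_basis([(0, 1, 1), (1, 1, 1)], 3))
```

Restricting to unit coordinate vectors loses nothing: if every $E_k$ already lies in $L(S)$, then $L(S) = V_n$ and $S$ is a basis.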

12.15 Exercises 

1. Let i and j denote the unit coordinate vectors in V s ■ In each case find scalars x and y such that 
x(i -j) + >'(/ + j) is equal to 

(a) »; (b )j\ (c) 3i - 5j; (d) 7i + 5 j. 

2. If A = (1, 2), B = (2, —4), and C = (2, -3) are three vectors in V 2 , find scalars x and y such 
that C = xA + yli. How many such pairs x, y are there? 

3. If A = (2, -1, 1), B = (1, 2, —1), and C = (2, —11, 7) are three vectors in V 3 , find scalars 
x and y such that C = xA+ yB. 

4. Prove that Exercise 3 has no solution ifCis replaced by the vector (2, 11,7). 

5. Let A and B be two nonzero vectors in V n . 

(a) If A and B are parallel, prove that A and B are linearly dependent. 

(b) If A and B are not parallel, prove that A and B are linearly independent. 

6. If (a, b) and (c, d) are two vectors in V 2 , prove that they are linearly independent if and only 
if ad — be ^ 0. 

7. Find all real t for which the two vectors (1 + t, 1 — /) and (1 — t, 1 + t) in V 2 are linearly 
independent. 

8. Let i,j, k be the unit coordinate vectors in V 3 . Prove that the four vectors i. j, k, i +j + k 
are linearly dependent, but that any three of them are linearly independent. 

9. Let i and j be the unit coordinate vectors in V 2 and let S = {/, i + j}. 

(a) Prove that S is linearly independent. 

(b) Prove that j is in the linear span of S. 

(c) Express 3i — 4j as a linear combination of i and i + j. 

(d) Prove that L(S) = V 2 . 

10. Consider the three vectors A = i, B = i + j, and C = i +j + 3k in V 3 . 

(a) Prove that the set (A, B, C}is linearly independent. 

(b) Express each of j and A: as a linear combination of A, B, and C. 

(c) Express 2i — 3j + 5k as a linear combination of A, B, and C. 

(d) Prove that (A, B, Cj is a basis for V 3 . 

11. Let A = (1, 2), B = (2, —4), C = (2, —3), and D = (1, -2) be four vectors in V 2 . Display 
all nonempty subsets of (A, B, C, Dj which are linearly independent. 

12. Let A = (1, 1, 1, 0), B = (0, 1, 1, 1) and C = (1, 1,0, 0) be three vectors in V i . 

(a) Determine whether A, B, C are linearly dependent or independent. 

(b) Exhibit a nonzero vector D such that A, B, C, D are dependent. 

(c) Exhibit a vector E such that A, B, C, E are independent. 

(d) Having chosen E in part (c), express the vector X = (1, 2, 3, 4) as a linear combination of
A, B, C, E.

13. (a) Prove that the following three vectors in V_3 are linearly independent: (√3, 1, 0), (1, √3, 1),
(0, 1, √3).

(b) Prove that the following three are dependent: (√2, 1, 0), (1, √2, 1), (0, 1, √2).

(c) Find all real t for which the following three vectors in V_3 are dependent: (t, 1, 0), (1, t, 1),
(0, 1, t).

14. Consider the following sets of vectors in V_4. In each case, find a linearly independent subset
containing as many vectors as possible.

(a) {(1, 0, 1, 0), (1, 1, 1, 1), (0, 1, 0, 1), (2, 0, -1, 0)}.

(b) {(1, 1, 1, 1), (1, -1, 1, 1), (1, -1, -1, 1), (1, -1, -1, -1)}.

(c) {(1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)}.

15. Given three linearly independent vectors A, B, C in V_n. Prove or disprove each of the following
statements.

(a) A + B, B + C, A + C are linearly independent.

(b) A - B, B + C, A + C are linearly independent.

16. (a) Prove that a set S of three vectors in V_3 is a basis for V_3 if and only if its linear span L(S)
contains the three unit coordinate vectors i, j, and k.

(b) State and prove a generalization of part (a) for V_n.

17. Find two bases for V_3 containing the two vectors (0, 1, 1) and (1, 1, 1).

18. Find two bases for V_4 having only the two vectors (0, 1, 1, 1) and (1, 1, 1, 1) in common.

19. Consider the following sets of vectors in V_3:

S = {(1, 1, 1), (0, 1, 2), (1, 0, -1)},   T = {(2, 1, 0), (2, 0, -2)},   U = {(1, 2, 3), (1, 3, 5)}.

(a) Prove that L(T) ⊆ L(S).

(b) Determine all inclusion relations that hold among the sets L(S), L(T), and L(U).

20. Let A and B denote two finite subsets of vectors in a vector space V_n, and let L(A) and L(B)
denote their linear spans. Prove each of the following statements.

(a) If A ⊆ B, then L(A) ⊆ L(B).

(b) L(A ∩ B) ⊆ L(A) ∩ L(B).

(c) Give an example in which L(A ∩ B) ≠ L(A) ∩ L(B).


12.16 The vector space V_n(C) of n-tuples of complex numbers

In Section 12.2 the vector space V_n was defined to be the collection of all n-tuples of
real numbers. Equality, vector addition, and multiplication by scalars were defined in
terms of the components as follows: If A = (a_1, ..., a_n) and B = (b_1, ..., b_n), then

A = B means a_i = b_i for each i = 1, 2, ..., n,

A + B = (a_1 + b_1, ..., a_n + b_n),   cA = (ca_1, ..., ca_n).

If all the scalars a_i, b_i, and c in these relations are replaced by complex numbers, the new
algebraic system so obtained is called complex vector space and is denoted by V_n(C).
Here C is used to remind us that the scalars are complex.

Since complex numbers satisfy the same field properties as real numbers, all theorems
about real vector space V_n that use only the field properties of the real numbers are also
valid for V_n(C), provided all the scalars are allowed to be complex. In particular, those
theorems in this chapter that involve only vector addition and multiplication by scalars
are also valid for V_n(C).

This extension is not made simply for the sake of generalization. Complex vector spaces
arise naturally in the theory of linear differential equations and in modern quantum
mechanics, so their study is of considerable importance. Fortunately, many of the theorems
about real vector space V_n carry over without change to V_n(C). Some small changes have
to be made, however, in those theorems that involve dot products. In proving that the dot
product A · A of a nonzero vector with itself is positive, we used the fact that a sum of
squares of real numbers is positive. Since a sum of squares of complex numbers can be
negative, we must modify the definition of A · B if we wish to retain the positivity property.
For V_n(C), we use the following definition of dot product.

definition. If A = (a_1, ..., a_n) and B = (b_1, ..., b_n) are two vectors in V_n(C), we
define their dot product A · B by the formula

$$A \cdot B = \sum_{k=1}^{n} a_k \overline{b_k},$$

where $\overline{b_k}$ is the complex conjugate of b_k.

Note that this definition agrees with the one given earlier for V_n because $\overline{b_k} = b_k$ when
b_k is real. The fundamental properties of the dot product, corresponding to those in
Theorem 12.2, now take the following form.

theorem 12.11. For all vectors A, B, C in V_n(C) and all complex scalars c, we have

(a) $A \cdot B = \overline{B \cdot A}$,

(b) $A \cdot (B + C) = A \cdot B + A \cdot C$,

(c) $c(A \cdot B) = (cA) \cdot B = A \cdot (\bar{c}B)$,

(d) $A \cdot A > 0$ if $A \neq O$,

(e) $A \cdot A = 0$ if $A = O$.

All these properties are easy consequences of the definition and their proofs are left as 
exercises. The reader should note that conjugation takes place in property (a) when the 
order of the factors is reversed. Also, conjugation of the scalar multiplier occurs in prop- 
erty (c) when the scalar c is moved from one side of the dot to the other. 
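
The conjugation rules in properties (a) and (c) can be checked numerically. The following Python sketch is an illustration only and is not part of the original text. The names cdot and smul are hypothetical helpers, and vectors in V_n(C) are modeled as tuples of Python complex numbers with integer real and imaginary parts so that the arithmetic is exact.

```python
def cdot(A, B):
    """Dot product in V_n(C): the sum of a_k times the conjugate of b_k."""
    return sum(a * complex(b).conjugate() for a, b in zip(A, B))

def smul(c, A):
    """Scalar multiple cA with a complex scalar c."""
    return tuple(c * a for a in A)

A = (1 + 2j, 3j)
B = (2 - 1j, 1 + 1j)
c = 2 + 3j

# Property (a): reversing the factors conjugates the dot product.
print(cdot(A, B) == cdot(B, A).conjugate())
# Property (c): moving c across the dot conjugates the scalar.
print(c * cdot(A, B) == cdot(smul(c, A), B) == cdot(A, smul(c.conjugate(), B)))
```

Replacing `c.conjugate()` by `c` in the last line makes the comparison fail for this choice of A, B, and c, which is exactly the point of property (c).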

The Cauchy-Schwarz inequality now takes the form

(12.14)   $|A \cdot B|^2 \leq (A \cdot A)(B \cdot B)$.

The proof is similar to that given for Theorem 12.3. We consider the vector C = xA - yB,
where x = B · B and y = A · B, and compute C · C. The inequality C · C ≥ 0 leads to
(12.14). Details are left as an exercise for the reader.

Since the dot product of a vector with itself is nonnegative, we can introduce the norm
of a vector in V_n(C) by the usual formula,

$$\|A\| = (A \cdot A)^{1/2}.$$

The fundamental properties of norms, as stated in Theorem 12.4, are also valid without
change for V_n(C). The triangle inequality, ||A + B|| ≤ ||A|| + ||B||, also holds in V_n(C).
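
As a numerical illustration (a sketch that is not part of the original text), the norm, the Cauchy-Schwarz inequality (12.14), and the triangle inequality can be spot-checked for one pair of complex vectors. The helper names and the tolerance eps are assumptions made for this example, and a check on sample vectors is of course not a proof.

```python
import math

def cdot(A, B):
    """Dot product in V_n(C): the sum of a_k times the conjugate of b_k."""
    return sum(a * complex(b).conjugate() for a, b in zip(A, B))

def norm(A):
    """Norm ||A|| = (A . A)^(1/2); A . A is real and nonnegative."""
    return math.sqrt(cdot(A, A).real)

def vadd(A, B):
    """Componentwise sum A + B."""
    return tuple(a + b for a, b in zip(A, B))

A = (3, 4j)
B = (1j, 2 - 1j)
eps = 1e-12  # tolerance for floating-point rounding

cauchy_schwarz = abs(cdot(A, B)) ** 2 <= cdot(A, A).real * cdot(B, B).real + eps
triangle = norm(vadd(A, B)) <= norm(A) + norm(B) + eps
print(norm(A), cauchy_schwarz, triangle)
```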

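Orthogonality of vectors in V_n(C) is defined by the relation A · B = 0. As in the real
case, two orthogonal vectors A and B in V_n(C) satisfy the Pythagorean
identity, ||A + B||^2 = ||A||^2 + ||B||^2. In V_n(C) the converse need not hold, because the
identity requires only that $A \cdot B + \overline{A \cdot B} = 0$ (see Exercise 3 of Section 12.17).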





The concepts of linear span, linear independence, linear dependence, and basis are defined
for V_n(C) exactly as in the real case. Theorems 12.7 through 12.10 and their proofs are all
valid without change for V_n(C).


12.17 Exercises 

1. Let A = (1, i), B = (i, -i), and C = (2i, 1) be three vectors in V_2(C). Compute each of the
following dot products:

(a) A · B;  (b) B · A;  (c) (iA) · B;  (d) A · (iB);  (e) (iA) · (iB);

(f) B · C;  (g) A · C;  (h) (B + C) · A;  (i) (A - C) · B;

(j) (A - iB) · (A + iB).

2. If A = (2, 1, -i) and B = (i, -1, 2i), find a nonzero vector C in V_3(C) orthogonal to both A
and B.

3. Prove that for any two vectors A and B in V_n(C), we have the identity

$$\|A + B\|^2 = \|A\|^2 + \|B\|^2 + A \cdot B + \overline{A \cdot B}.$$

4. Prove that for any two vectors A and B in V_n(C), we have the identity

$$\|A + B\|^2 - \|A - B\|^2 = 2(A \cdot B + \overline{A \cdot B}).$$

5. Prove that for any two vectors A and B in V_n(C), we have the identity

$$\|A + B\|^2 + \|A - B\|^2 = 2\|A\|^2 + 2\|B\|^2.$$

6. (a) Prove that for any two vectors A and B in V_n(C), the sum $A \cdot B + \overline{A \cdot B}$ is real.

(b) If A and B are nonzero vectors in V_n(C), prove that

$$-2 \leq \frac{A \cdot B + \overline{A \cdot B}}{\|A\| \, \|B\|} \leq 2.$$

7. We define the angle θ between two nonzero vectors A and B in V_n(C) by the equation

$$\theta = \arccos \frac{\tfrac{1}{2}(A \cdot B + \overline{A \cdot B})}{\|A\| \, \|B\|}.$$

The inequality in Exercise 6 shows that there is always a unique angle θ in the closed interval
0 ≤ θ ≤ π satisfying this equation. Prove that we have

$$\|A - B\|^2 = \|A\|^2 + \|B\|^2 - 2\|A\| \, \|B\| \cos\theta.$$


8. Use the definition in Exercise 7 to compute the angle between the following two vectors in
V_5(C): A = (1, 0, i, i, i), and B = (i, i, i, 0, i).

9. (a) Prove that the following three vectors form a basis for V_3(C): A = (1, 0, 0), B = (0, i, 0),
C = (1, 1, i).

(b) Express the vector (5, 2 - i, 2i) as a linear combination of A, B, C.

10. Prove that the basis of unit coordinate vectors E_1, ..., E_n in V_n is also a basis for V_n(C).


APPLICATIONS OF VECTOR ALGEBRA
TO ANALYTIC GEOMETRY


13.1 Introduction 

This chapter discusses applications of vector algebra to the study of lines, planes, and 
conic sections. In Chapter 14 vector algebra is combined with the methods of calculus, and 
further applications are given to the study of curves and to some problems in mechanics. 

The study of geometry as a deductive system, as conceived by Euclid around 300 B.C.,
begins with a set of axioms or postulates which describe properties of points and lines. 
The concepts “point” and “line” are taken as primitive notions and remain undefined. 
Other concepts are defined in terms of points and lines, and theorems are systematically 
deduced from the axioms. Euclid listed ten axioms from which he attempted to deduce all 
his theorems. It has since been shown that these axioms are not adequate for the theory. 
For example, in the proof of his very first theorem Euclid made a tacit assumption concern- 
ing the intersection of two circles that is not covered by his axioms. Since then other lists 
of axioms have been formulated that do give all of Euclid’s theorems. The most famous 
of these is a list given by the German mathematician David Hilbert (1862-1943) in his now 
classic Grundlagen der Geometrie, published in 1899. (An English translation exists: 
The Foundations of Geometry, Open Court Publishing Co., 1947.) This work, which went 
through seven German editions in Hilbert’s lifetime, is said to have inaugurated the abstract
mathematics of the twentieth century.

Hilbert starts his treatment of plane geometry with five undefined concepts: point, line, 
on (a relation holding between a point and a line), between (a relation between a point and a 
pair of points), and congruence (a relation between pairs of points). He then gives fifteen 
axioms from which he develops all of plane Euclidean geometry. His treatment of solid 
geometry is based on twenty-one axioms involving six undefined concepts. 

The approach in analytic geometry is somewhat different. We define concepts such as 
point, line, on, between, etc., but we do so in terms of real numbers, which are left un- 
defined. The resulting mathematical structure is called an analytic model of Euclidean 
geometry. In this model, properties of real numbers are used to deduce Hilbert’s axioms. 
We shall not attempt to describe all of Hilbert’s axioms. Instead, we shall merely indicate
how the primitive concepts may be defined in terms of numbers and give a few proofs to
illustrate the methods of analytic geometry.

13.2 Lines in n-space

In this section we use real numbers to define the concepts of point, line, and on. The
definitions are formulated to fit our intuitive ideas about three-dimensional Euclidean
geometry, but they are meaningful in n-space for any n ≥ 1.

A point is simply a vector in V_n, that is, an ordered n-tuple of real numbers; we shall use
the words “point” and “vector” interchangeably. The vector space V_n is called an analytic
model of n-dimensional Euclidean space or simply Euclidean n-space. To define “line,” we
employ the algebraic operations of addition and multiplication by scalars in V_n.

definition. Let P be a given point and A a given nonzero vector. The set of all points
of the form P + tA, where t runs through all real numbers, is called a line through P parallel
to A. We denote this line by L(P; A) and write

L(P; A) = {P + tA | t real}   or, more briefly,   L(P; A) = {P + tA}.

A point Q is said to be on the line L(P; A) if Q ∈ L(P; A).


In the symbol L(P; A), the point P which is written first is on the line since it corresponds
to t = 0. The second point, A, is called a direction vector for the line. The line L(O; A)
through the origin O is the linear span of A; it consists of all scalar multiples of A. The
line through P parallel to A is obtained by adding P to each vector in the linear span of A.
Figure 13.1 shows the geometric interpretation of this definition in V_3. Each point P + tA
can be visualized as the tip of a geometric vector drawn from the origin. As t varies over
all the real numbers, the corresponding point P + tA traces out a line through P parallel
to the vector A. Figure 13.1 shows points corresponding to a few values of t on both lines
L(P; A) and L(O; A).



Figure 13.1 The line L(P; A) through P parallel to A and its geometric relation to
the line L(O; A) through O parallel to A.

13.3 Some simple properties of straight lines

First we show that the direction vector A which occurs in the definition of L(P; A) can
be replaced by any vector parallel to A. (We recall that two vectors A and B are called
parallel if A = cB for some nonzero scalar c.)


theorem 13.1. Two lines L(P; A) and L(P; B) through the same point P are equal if
and only if the direction vectors A and B are parallel.

Proof. Assume first that L(P; A) = L(P; B). Take a point on L(P; A) other than P,
for example, P + A. This point is also on L(P; B) so P + A = P + cB for some scalar c.
Hence, we have A = cB and c ≠ 0 since A ≠ O. Therefore, A and B are parallel.

Now we prove the converse. Assume A and B are parallel, say A = cB for some c ≠ 0.
If Q is on L(P; A), then we have Q = P + tA = P + t(cB) = P + (ct)B, so Q is on
L(P; B). Therefore L(P; A) ⊆ L(P; B). Similarly, L(P; B) ⊆ L(P; A), so L(P; A) = L(P; B).

Next we show that the point P which occurs in the definition of L(P; A) can be replaced
by any other point Q on the same line.

theorem 13.2. Two lines L(P; A) and L(Q; A) with the same direction vector A are
equal if and only if Q is on L(P; A).

Proof. Assume L(P; A) = L(Q; A). Since Q is on L(Q; A), Q is also on L(P; A).
To prove the converse, assume that Q is on L(P; A), say Q = P + cA. We wish to prove
that L(P; A) = L(Q; A). If X ∈ L(P; A), then X = P + tA for some t. But P = Q - cA,
so X = Q - cA + tA = Q + (t - c)A, and hence X is also on L(Q; A). Therefore
L(P; A) ⊆ L(Q; A). Similarly, we find L(Q; A) ⊆ L(P; A), so the two lines are equal.

One of Euclid’s famous postulates is the parallel postulate, which is logically equivalent
to the statement that “through a given point there exists one and only one line parallel to a
given line.” We shall deduce this property as an easy consequence of Theorem 13.1.
First we need to define parallelism of lines.


definition. Two lines L(P; A) and L(Q; B) are called parallel if their direction vectors
A and B are parallel.


theorem 13.3. Given a line L and a point Q not on L, then there is one and only one
line L' containing Q and parallel to L.

Proof. Suppose the given line has direction vector A. Consider the line L' = L(Q; A).
This line contains Q and is parallel to L. Theorem 13.1 tells us that this is the only line
with these two properties.

Note: For a long time mathematicians suspected that the parallel postulate could
be deduced from the other Euclidean postulates, but all attempts to prove this resulted
in failure. Then in the early 19th century the mathematicians Karl F. Gauss (1777-1855),
J. Bolyai (1802-1860), and N. I. Lobatchevski (1793-1856) became convinced that the
parallel postulate could not be derived from the others and proceeded to develop
non-Euclidean geometries, that is to say, geometries in which the parallel postulate does not
hold. The work of these men inspired other mathematicians and scientists to enlarge
their points of view about “accepted truths” and to challenge other axioms that had been
considered sacred for centuries.

It is also easy to deduce the following property of lines which Euclid stated as an axiom. 

theorem 13.4. Two distinct points determine a line. That is, if P ≠ Q, there is one
and only one line containing both P and Q. It can be described as the set {P + t(Q - P)}.

Proof. Let L be the line through P parallel to Q - P, that is, let

L = L(P; Q - P) = {P + t(Q - P)}.

This line contains both P and Q (take t = 0 to get P and t = 1 to get Q). Now let L' be
any line containing both P and Q. We shall prove that L' = L. Since L' contains P, we
have L' = L(P; A) for some A ≠ O. But L' also contains Q so P + cA = Q for some c.
Hence we have Q - P = cA, where c ≠ 0 since Q ≠ P. Therefore Q - P is parallel to A
so, by Theorem 13.1, we have L' = L(P; A) = L(P; Q - P) = L.

example. Theorem 13.4 gives us an easy way to test if a point Q is on a given line
L(P; A). It tells us that Q is on L(P; A) if and only if Q - P is parallel to A. For example,
consider the line L(P; A), where P = (1, 2, 3) and A = (2, -1, 5). To test if the point
Q = (1, 1, 4) is on this line, we examine Q - P = (0, -1, 1). Since Q - P is not a scalar
multiple of A, the point (1, 1, 4) is not on this line. On the other hand, if Q = (5, 0, 13),
we find that Q - P = (4, -2, 10) = 2A, so this Q is on the line.
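
The test used in this example is easy to mechanize. The following Python sketch is an illustration only, not part of the original text. The function name on_line is hypothetical, and exact rational arithmetic (the standard fractions module) is an assumption made so that equality tests are reliable. The function also accepts Q = P, which corresponds to t = 0.

```python
from fractions import Fraction

def on_line(Q, P, A):
    """Return True if Q lies on L(P; A), that is, if Q - P = tA for some real t."""
    if all(a == 0 for a in A):
        raise ValueError("direction vector A must be nonzero")
    D = [Fraction(q) - Fraction(p) for q, p in zip(Q, P)]
    # A nonzero component of A determines the only possible value of t.
    k = next(i for i, a in enumerate(A) if a != 0)
    t = D[k] / Fraction(A[k])
    # Q is on the line exactly when this t works in every component.
    return all(d == t * Fraction(a) for d, a in zip(D, A))

P, A = (1, 2, 3), (2, -1, 5)
print(on_line((1, 1, 4), P, A))
print(on_line((5, 0, 13), P, A))
```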

Linear dependence of two vectors in V_n can be expressed in geometric language.

theorem 13.5. Two vectors A and B in V_n are linearly dependent if and only if they lie
on the same line through the origin.

Proof. If either A or B is zero, the result holds trivially. If both are nonzero, then A
and B are dependent if and only if B = tA for some scalar t. But B = tA if and only if B
lies on the line through the origin parallel to A.

13.4 Lines and vector-valued functions 

The concept of a line can be related to the function concept. The correspondence which
associates to each real t the vector P + tA on the line L(P; A) is an example of a function
whose domain is the set of real numbers and whose range is the line L(P; A). If we denote
the function by the symbol X, then the function value X(t) at t is given by the equation

(13.1)   X(t) = P + tA.

We call this a vector-valued function of a real variable.
The function point of view is important because, as we shall see in Chapter 14, it provides
a natural method for describing more general space curves as well.

The scalar t in Equation (13.1) is often called a parameter, and Equation (13.1) is called a
vector parametric equation or, simply, a vector equation of the line. Occasionally it is
convenient to think of the line as the track of a moving particle, in which case the parameter t
is referred to as time and the vector X(t) is called the position vector.

Note that two points X(a) and X(b) on a given line L(P; A) are equal if and only if we have
P + aA = P + bA, or (a - b)A = O. Since A ≠ O, this last relation holds if and only if
a = b. Thus, distinct values of the parameter t lead to distinct points on the line.

Now consider three distinct points on a given line, say X(a), X(b), and X(c), where a < b.
We say that X(c) is between X(a) and X(b) if c is between a and b, that is, if a < c < b.

Congruence can be defined in terms of norms. A pair of points P, Q is called congruent
to another pair P', Q' if ||P - Q|| = ||P' - Q'||. The norm ||P - Q|| is also called the
distance between P and Q.

This completes the definitions of the concepts of point, line, on, between, and congruence
in our analytic model of Euclidean n-space. We conclude this section with some further
remarks concerning parametric equations for lines in 3-space.

If a line passes through two distinct points P and Q, we can use Q - P for the direction
vector A in Equation (13.1); the vector equation of the line then becomes

X(t) = P + t(Q - P)   or   X(t) = tQ + (1 - t)P.
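
The two forms of the vector equation can be compared numerically. The short Python sketch below is illustrative only and not part of the original text. The function names line_point and two_point are hypothetical, and points are modeled as tuples of integers so the comparison is exact.

```python
def line_point(P, A, t):
    """X(t) = P + tA, computed component by component."""
    return tuple(p + t * a for p, a in zip(P, A))

def two_point(P, Q, t):
    """X(t) = tQ + (1 - t)P, the line through two distinct points P and Q."""
    return tuple(t * q + (1 - t) * p for p, q in zip(P, Q))

P, Q = (1, 2, 3), (5, 0, 13)
A = tuple(q - p for p, q in zip(P, Q))  # direction vector Q - P
for t in (-1, 0, 1, 2):
    print(t, line_point(P, A, t), two_point(P, Q, t))
```

In this parametrization the value t = 0 gives P and t = 1 gives Q, so values of t strictly between 0 and 1 give points between P and Q.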

Vector equations can also be expressed in terms of components. For example, if we
write P = (p, q, r), A = (a, b, c), and X(t) = (x, y, z), Equation (13.1) is equivalent to the
three scalar equations

(13.2)   x = p + ta,   y = q + tb,   z = r + tc.

These are called scalar parametric equations or simply parametric equations for the line;
they are useful in computations involving components. The vector equation is simpler
and more natural for studying general properties of lines.

If all the vectors are in 2-space, only the first two parametric equations in (13.2) are
needed. In this case, we can eliminate t from the two parametric equations to obtain the
relation

(13.3)   b(x - p) - a(y - q) = 0,

which is called a Cartesian equation for the line. If a ≠ 0, this can be written in the point-slope
form

$$y - q = \frac{b}{a}(x - p).$$

The point (p, q) is on the line; the number b/a is the slope of the line.

The Cartesian equation (13.3) can also be written in terms of dot products. If we let
N = (b, -a), X = (x, y), and P = (p, q), Equation (13.3) becomes

(X - P) · N = 0   or   X · N = P · N.





The vector N is perpendicular to the direction vector A since N · A = ba - ab = 0; the
vector N is called a normal vector to the line. The line consists of all points X satisfying
the relation (X - P) · N = 0.
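
A small numerical check of this description of a line in 2-space follows. It is a sketch added for illustration, not part of the original text. The function names dot and normal are hypothetical, and the sample point P and direction A are arbitrary choices.

```python
def dot(X, Y):
    """Dot product of two real vectors given as tuples."""
    return sum(x * y for x, y in zip(X, Y))

def normal(A):
    """For A = (a, b) in V_2 return N = (b, -a), which satisfies N . A = 0."""
    a, b = A
    return (b, -a)

P, A = (1, 2), (3, 4)
N = normal(A)
# A few points X = P + tA on the line L(P; A).
points = [(P[0] + t * A[0], P[1] + t * A[1]) for t in (-2, 0, 1, 5)]
# N . A should vanish, and every X on the line should satisfy X . N = P . N.
print(dot(N, A), [dot(X, N) - dot(P, N) for X in points])
```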

The geometric meaning of this relation is shown in Figure 13.2. The points P and X are
on the line and the normal vector N is orthogonal to X - P. The figure suggests that among
all points X on the line, the smallest length ||X|| occurs when X is the projection of P along
N. We now give an algebraic proof of this fact.



Figure 13.2 A line in the xy-plane through P with normal vector N. Each point X
on the line satisfies (X - P) · N = 0.

theorem 13.6. Let L be the line in V_2 consisting of all points X satisfying

X · N = P · N,

where P is on the line and N is a nonzero vector normal to the line. Let

$$d = \frac{|P \cdot N|}{\|N\|}.$$

Then every X on L has length ||X|| ≥ d. Moreover, ||X|| = d if and only if X is the projection
of P along N:

$$X = tN, \quad \text{where } t = \frac{P \cdot N}{N \cdot N}.$$


Proof. If X ∈ L, we have X · N = P · N. By the Cauchy-Schwarz inequality, we have

|P · N| = |X · N| ≤ ||X|| ||N||,

which implies ||X|| ≥ |P · N| / ||N|| = d. The equality sign holds if and only if X = tN
for some scalar t, in which case P · N = X · N = tN · N, so t = (P · N)/(N · N). This
completes the proof.

In the same way we can prove that if Q is a given point in V_2 not on the line L, then for
all X on L the smallest value of ||X - Q|| is |(P - Q) · N| / ||N||, and this occurs when
X - Q is the projection of P - Q along the normal vector N. The number

$$\frac{|(P - Q) \cdot N|}{\|N\|}$$

is called the distance from the point Q to the line L. The reader should illustrate these
concepts on a figure similar to that in Figure 13.2.
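
The distance formula and the projection in Theorem 13.6 translate directly into a short computation. The Python sketch below is an illustration under stated assumptions and is not part of the original text. The function names are hypothetical, and the sample line 3x + 4y = 20 is chosen so that the norm of N is a whole number.

```python
import math

def dot(X, Y):
    """Dot product of two real vectors given as tuples."""
    return sum(x * y for x, y in zip(X, Y))

def distance_to_line(Q, P, N):
    """Distance |(P - Q) . N| / ||N|| from Q to the line X . N = P . N in V_2."""
    diff = (P[0] - Q[0], P[1] - Q[1])
    return abs(dot(diff, N)) / math.hypot(N[0], N[1])

def nearest_to_origin(P, N):
    """The point X = tN on the line, with t = (P . N)/(N . N) as in Theorem 13.6."""
    t = dot(P, N) / dot(N, N)
    return (t * N[0], t * N[1])

P, N = (0, 5), (3, 4)  # the line 3x + 4y = 20
X = nearest_to_origin(P, N)
print(distance_to_line((0, 0), P, N), X)
```

Taking Q to be the origin recovers the number d of Theorem 13.6, and the length of the point returned by nearest_to_origin agrees with it up to rounding.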

13.5 Exercises 

1. A line L in V_2 contains the two points P = (-3, 1) and Q = (1, 1). Determine which of the
following points are on L. (a) (0, 0); (b) (0, 1); (c) (1, 2); (d) (2, 1); (e) (-2, 1).

2. Solve Exercise 1 if P = (2, -1) and Q = (-4, 2).

3. A line L in V_3 contains the point P = (-3, 1, 1) and is parallel to the vector (1, -2, 3).
Determine which of the following points are on L. (a) (0, 0, 0); (b) (2, -1, 4); (c) (-2, -1, 4);
(d) (-4, 3, -2); (e) (2, -9, 16).

4. A line L contains the two points P = (-3, 1, 1) and Q = (1, 2, 7). Determine which of the
following points are on L. (a) (-7, 0, 5); (b) (-7, 0, -5); (c) (-11, 1, 11); (d) (-11, -1, 11);

(e) (-1,1,4); (f) (— f, 3); (g) (-1, t, -4). 

5. In each case, determine if all three points P, Q, R lie on a line.

(a) P = (2, 1, 1), Q = (4, 1, -1), R = (3, -1, 1).

(b) P = (2, 2, 3), Q = (-2, 3, 1), R = (-6, 4, 1).

(c) P = (2, 1, 1), Q = (-2, 3, 1), R = (5, -1, 1).

6. Among the following eight points, the three points A, B, and C lie on a line. Determine all
subsets of three or more points which lie on a line: A = (2, 1, 1), B = (6, -1, 1), C =
(-6, 5, 1), D = (-2, 3, 1), E = (1, 1, 1), F = (-4, 4, 1), G = (-13, 9, 1), H = (14, -6, 1).

7. A line through the point P = (1, 1, 1) is parallel to the vector A = (1, 2, 3). Another line
through Q = (2, 1, 0) is parallel to the vector B = (3, 8, 13). Prove that the two lines intersect
and determine the point of intersection.

8. (a) Prove that two lines L(P; A) and L(Q; B) in V_n intersect if and only if P - Q is in the
linear span of A and B.

(b) Determine whether or not the following two lines in V_3 intersect:

L = {(1, 1, -1) + t(-2, 1, 3)},   L' = {(3, -4, 1) + t(-1, 5, 2)}.

9. Let X(t) = P + tA be an arbitrary point on the line L(P; A), where P = (1, 2, 3) and A =
(1, -2, 2), and let Q = (3, 3, 1).

(a) Compute ||Q - X(t)||^2, the square of the distance between Q and X(t).

(b) Prove that there is exactly one point X(t_0) for which the distance ||Q - X(t)|| is a minimum,
and compute this minimum distance.

(c) Prove that Q - X(t_0) is orthogonal to A.

10. Let Q be a point not on the line L(P; A) in V_n.

(a) Let f(t) = ||Q - X(t)||^2, where X(t) = P + tA. Prove that f(t) is a quadratic polynomial
in t and that this polynomial takes on its minimum value at exactly one t, say at t = t_0.

(b) Prove that Q - X(t_0) is orthogonal to A.

11. Given two parallel lines L(P; A) and L(Q; A) in V_n. Prove that either L(P; A) = L(Q; A)
or the intersection L(P; A) ∩ L(Q; A) is empty.

12. Given two lines L(P; A) and L(Q; B) in V_n which are not parallel. Prove that the intersection
is either empty or consists of exactly one point.

13.6 Planes in Euclidean n-space

A line in n-space was defined to be a set of the form {P + tA} obtained by adding to a
given point P all vectors in the linear span of a nonzero vector A. A plane is defined in a
similar fashion except that we add to P all vectors in the linear span of two linearly
independent vectors A and B. To make certain that V_n contains two linearly independent
vectors, we assume at the outset that n ≥ 2. Most of our applications will be concerned
with the case n = 3.



Figure 13.3 The plane through P spanned by A and B, and its geometric relation
to the plane through O spanned by A and B.

definition. A set M of points in V_n is called a plane if there is a point P and two linearly
independent vectors A and B such that

M = {P + sA + tB | s, t real}.

We shall denote the set more briefly by writing M = {P + sA + tB}. Each point of M
is said to be on the plane. In particular, taking s = t = 0, we see that P is on the plane. The
set {P + sA + tB} is also called the plane through P spanned by A and B. When P is the
origin, the plane is simply the linear span of A and B. Figure 13.3 shows a plane in V_3
through the origin spanned by A and B and also a plane through a nonzero point P spanned
by the same two vectors.

Now we shall deduce some properties of planes analogous to the properties of lines given
in Theorems 13.1 through 13.4. The first of these shows that the vectors A and B in the
definition of the plane {P + sA + tB} can be replaced by any other pair which has the
same linear span.

theorem 13.7. Two planes M = {P + sA + tB} and M' = {P + sC + tD} through
the same point P are equal if and only if the linear span of A and B is equal to the linear
span of C and D.

Proof. If the linear span of A and B is equal to that of C and D, then it is clear that
M = M'. Conversely, assume that M = M'. Plane M contains both P + A and P + B.
Since both these points are also on M', each of A and B must be in the linear span of C
and D. Similarly, each of C and D is in the linear span of A and B. Therefore the linear
span of A and B is equal to that of C and D.

The next theorem shows that the point P which occurs in the definition of the plane
{P + sA + tB} can be replaced by any other point Q on the same plane.

theorem 13.8. Two planes M = {P + sA + tB} and M' = {Q + sA + tB} spanned by
the same vectors A and B are equal if and only if Q is on M.

Proof. If M = M', then Q is certainly on M. To prove the converse, assume Q is on
M, say Q = P + aA + bB. Take any point X in M. Then X = P + sA + tB for some
scalars s and t. But P = Q - aA - bB, so X = Q + (s - a)A + (t - b)B. Therefore
X is in M', so M ⊆ M'. Similarly, we find that M' ⊆ M, so the two planes are equal.

Euclid’s parallel postulate (Theorem 13.3) has an analog for planes. Before we state this
theorem we need to define parallelism of two planes. The definition is suggested by the
geometric representation in Figure 13.3.

definition. Two planes M = {P + sA + tB} and M' = {Q + sC + tD} are said to
be parallel if the linear span of A and B is equal to the linear span of C and D. We also say
that a vector X is parallel to the plane M if X is in the linear span of A and B.

theorem 13.9. Given a plane M and a point Q not on M, there is one and only one plane
M' which contains Q and is parallel to M.

Proof. Let M = {P + sA + tB} and consider the plane M' = {Q + sA + tB}. This
plane contains Q and is spanned by the same vectors A and B which span M. Therefore
M' is parallel to M. If M'' is another plane through Q parallel to M, then

M'' = {Q + sC + tD}

where the linear span of C and D is equal to that of A and B. By Theorem 13.7, we must
have M'' = M'. Therefore M' is the only plane through Q which is parallel to M.

Theorem 13.4 tells us that two distinct points determine a line. The next theorem shows
that three distinct points determine a plane, provided that the three points are not collinear.

theorem 13.10. If P, Q, and R are three points not on the same line, then there is one
and only one plane M containing these three points. It can be described as the set

(13.4)   M = {P + s(Q - P) + t(R - P)}.

Proof. We assume first that one of the points, say P, is the origin. Then Q and R are
not on the same line through the origin so they are linearly independent. Therefore, they
span a plane through the origin, say the plane

M' = {sQ + tR}.

This plane contains all three points O, Q, and R.

Now we prove that M' is the only plane which contains all three points O, Q, and R.
Any other plane through the origin has the form

M'' = {sA + tB},

where A and B are linearly independent. If M'' contains Q and R, we have

(13.5)   Q = aA + bB,   R = cA + dB,

for some scalars a, b, c, d. Hence, every linear combination of Q and R is also a linear
combination of A and B, so M' ⊆ M''.

To prove that M'' ⊆ M', it suffices to prove that each of A and B is a linear combination
of Q and R. Multiplying the first equation in (13.5) by d and the second by b and
subtracting, we eliminate B and get

(ad - bc)A = dQ - bR.

Now ad - bc cannot be zero, otherwise Q and R would be dependent. Therefore we can
divide by ad - bc and express A as a linear combination of Q and R. Similarly, we can
express B as a linear combination of Q and R, so we have M'' ⊆ M'. This proves the
theorem when one of the three points P, Q, R is the origin.

To prove the theorem in the general case, let M be the set in (13.4), and let C = Q - P,
D = R - P. First we show that C and D are linearly independent. If not we would have
D = tC for some scalar t, giving us R - P = t(Q - P), or R = P + t(Q - P), contradicting
the fact that P, Q, R are not on the same line. Therefore the set M is a plane
through P spanned by the linearly independent pair C and D. This plane contains all three
points P, Q, and R (take s = 1, t = 0 to get Q, and s = 0, t = 1 to get R). Now we must
prove that this is the only plane containing P, Q, and R.

Let M' be any plane containing P, Q, and R. Since M' is a plane containing P, we have

M' = {P + sA + tB}

for some linearly independent pair A and B. Let M'_0 = {sA + tB} be the plane through the
origin spanned by the same pair A and B. Clearly, M' contains a vector X if and only if
M'_0 contains X - P. Since M' contains Q and R, the plane M'_0 contains C = Q - P and
D = R - P. But we have just shown that there is one and only one plane containing O,
C, and D since C and D are linearly independent. Therefore M'_0 = {sC + tD}, so M' =
{P + sC + tD} = M. This completes the proof.
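
The elimination step used in this proof also gives a practical membership test for the plane through three given points. The following Python sketch is an illustration only and not part of the original text. The function name plane_coords is hypothetical, exact rational arithmetic from the standard fractions module is an assumption made so that equality tests are reliable, and the pair of coordinates used for the elimination is simply the first pair whose 2 x 2 expression, playing the role of ad - bc in the proof, is nonzero.

```python
from fractions import Fraction
from itertools import combinations

def plane_coords(X, P, Q, R):
    """Return (s, t) with X = P + s(Q - P) + t(R - P), or None if X is off the plane."""
    C = [Fraction(q) - Fraction(p) for q, p in zip(Q, P)]
    D = [Fraction(r) - Fraction(p) for r, p in zip(R, P)]
    Y = [Fraction(x) - Fraction(p) for x, p in zip(X, P)]
    for i, j in combinations(range(len(P)), 2):
        det = C[i] * D[j] - C[j] * D[i]
        if det != 0:
            # Solve sC + tD = Y in coordinates i and j by elimination.
            s = (Y[i] * D[j] - Y[j] * D[i]) / det
            t = (C[i] * Y[j] - C[j] * Y[i]) / det
            # X is on the plane only if the same s, t work in every coordinate.
            ok = all(y == s * c + t * d for y, c, d in zip(Y, C, D))
            return (s, t) if ok else None
    raise ValueError("P, Q, R lie on a line, so they do not determine a plane")

P, Q, R = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(plane_coords(Q, P, Q, R), plane_coords(R, P, Q, R))
print(plane_coords((0, 0, 0), P, Q, R))
```

If every such 2 x 2 expression vanishes, the vectors Q - P and R - P are dependent, which is the collinear case excluded by Theorem 13.10.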


In Theorem 13.5 we proved that two vectors in V n are linearly dependent if and only if 
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they lie on a line through the origin. The next theorem is the corresponding result for three 
vectors. 

THEOREM 13.11. Three vectors A, B. C in V n are linearly dependent if and only if they 
lie on the same plane through the origin. 

Proof. Assume A, B, C are dependent. Then we can express one of the vectors as a 
linear combination of the other two, say C = sA + tB. If A and B are independent, they 
span a plane through the origin and C is on this plane. If A and B are dependent, then 
A, B, and C lie on a line through the origin, and hence they lie on any plane through the 
origin which contains all three points A, B, and C. 

To prove the converse, assume that A, B, C lie on the same plane through the origin, say
the plane M. If A and B are dependent, then A, B, and C are dependent, and there is
nothing more to prove. If A and B are independent, they span a plane M' through the
origin. By Theorem 13.10, there is one and only one plane through O containing A and B.
Therefore M' = M. Since C is on this plane, we must have C = sA + tB, so A, B, and
C are dependent.


13.7 Planes and vector-valued functions 

The correspondence which associates to each pair of real numbers s and t the vector
P + sA + tB on the plane M = {P + sA + tB} is another example of a vector-valued
function. In this case, the domain of the function is the set of all pairs of real numbers 
(s, t) and its range is the plane M. If we denote the function by X and the function values 
by X(s, t), then for each pair (s, t) we have 

(13.6) X(s, t) = P + sA + tB . 

We call X a vector-valued function of two real variables. The scalars s and t are called 
parameters, and the equation (13.6) is called a parametric or vector equation of the plane. 
This is analogous to the representation of a line by a vector-valued function of one real 
variable. The presence of two parameters in Equation (13.6) gives the plane a two-dimensional
quality. When each vector is in V_3 and is expressed in terms of its components,
say

$$P = (p_1, p_2, p_3), \quad A = (a_1, a_2, a_3), \quad B = (b_1, b_2, b_3), \quad \text{and} \quad X(s, t) = (x, y, z),$$

the vector equation (13.6) can be replaced by three scalar equations,

$$x = p_1 + s a_1 + t b_1, \qquad y = p_2 + s a_2 + t b_2, \qquad z = p_3 + s a_3 + t b_3.$$

The parameters s and t can always be eliminated from these three equations to give one
linear equation of the form ax + by + cz = d, called a Cartesian equation of the plane.
We illustrate with an example.

EXAMPLE. Let M = {P + sA + tB}, where P = (1, 2, 3), A = (1, 2, 1), and B =
(1, -4, -1). The corresponding vector equation is

X(s, t) = (1, 2, 3) + s(1, 2, 1) + t(1, -4, -1).

From this we obtain the three scalar parametric equations

x = 1 + s + t,  y = 2 + 2s - 4t,  z = 3 + s - t.

To obtain a Cartesian equation, we rewrite the first and third equations in the form x - 1 =
s + t, z - 3 = s - t. Adding and then subtracting these equations, we find that 2s =
x + z - 4, 2t = x - z + 2. Substituting in the equation for y, we are led to the Cartesian
equation x + y - 3z = -6. We shall return to a further study of linear Cartesian equations
in Section 13.16.
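The elimination in this example can be checked numerically. The following is a minimal sketch in Python; the function names `plane_point` and `satisfies_cartesian` are illustrative choices, not notation from the text. It evaluates X(s, t) = P + sA + tB on a small grid of parameter values and confirms that every resulting point satisfies x + y - 3z = -6.

```python
from itertools import product

# Data of the example: the plane M = {P + sA + tB}.
P = (1, 2, 3)
A = (1, 2, 1)
B = (1, -4, -1)

def plane_point(s, t):
    """Return X(s, t) = P + sA + tB, computed componentwise."""
    return tuple(p + s * a + t * b for p, a, b in zip(P, A, B))

def satisfies_cartesian(point):
    """Check the Cartesian equation x + y - 3z = -6 obtained in the text."""
    x, y, z = point
    return x + y - 3 * z == -6

# Sample a grid of integer parameter pairs (s, t).
samples = [plane_point(s, t) for s, t in product(range(-3, 4), repeat=2)]
all_on_plane = all(satisfies_cartesian(X) for X in samples)
print(all_on_plane)
```

Integer parameters keep the arithmetic exact, so the equality test is reliable; with floating-point parameters a tolerance would be needed instead.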


13.8 Exercises 

1. Let M = {P + sA + tB}, where P = (1, 2, —3), A = (3, 2, 1), and B = (1, 0, 4). Determine 
which of the following points are on M 

(a) (1,2,0); (b) (1,2, 1); (c) (6,4, 6); (d) (6,6, 6); (e) (6, 6, -5). 

2. The three points P = (1, 1, —1), Q = (3, 3, 2), and R = (3, -1, -2) determine a plane M. 
Determine which of the following points are on M. 

(a) (2, 2, i); (b) (4, 0, -*); (c) (-3, 1, - 5 ) ; (d) (3, 1, 3); (e) (0, 0, 0). 

3. Determine scalar parametric equations for each of the following planes. 

(a) The plane through (1, 2, 1) spanned by the vectors (0, 1, 0) and (1, 1,4). 

(b) The plane through (1, 2, 1), (0, 1, 0), and (1, 1, 4). 

4. A plane M has scalar parametric equations 

x = 1 + s — 2t, y = 2 + s + 4t, z = 2s + t. 

(a) Determine which of the following points are on M: (0, 0, 0), (1, 2, 0), (2, -3, -3). 

(b) Find vectors P, A, and B such that M = {P + sA + tB}.

5. Let M be the plane determined by three points P, Q, R not on the same line. 

(a) If p, q, r are three scalars such that p + q + r = 1, prove that pP + qQ + rR is on M. 

(b) Prove that every point on M has the form pP + qQ + rR, where p + q + r = 1.

6. Determine a linear Cartesian equation of the form ax + by + cz = d for each of the following 
planes. 

(a) The plane through (2, 3, 1) spanned by (3, 2, 1) and (-1, -2, -3). 

(b) The plane through (2, 3, 1), (-2, -1, —3), and (4, 3, 1). 

(c) The plane through (2, 3, 1) parallel to the plane through the origin spanned by (2, 0,-2) 
and (1, 1, 1). 

7. A plane M has the Cartesian equation 3x - 5y + z = 9.

(a) Determine which of the following points are on M: (0, -2, -1), (-1, -2, 2), (3, 1, -5).

(b) Find vectors P, A, and B such that M = {P + sA + tB}.

8. Consider the two planes M = {P + sA + tB} and M' = {Q + sC + tD}, where P = (1, 1, 1),
A = (2, -1, 3), B = (-1, 0, 2), Q = (2, 3, 1), C = (1, 2, 3), and D = (3, 2, 1). Find two
distinct points on the intersection M ∩ M'.

9. Given a plane M = {P + sA + tB}, where P = (2, 3, 1), A = (1, 2, 3), and B = (3, 2, 1), and
another plane M' with Cartesian equation x - 2y + z = 0.

(a) Determine whether M and M' are parallel.

(b) Find two points on the intersection M' ∩ M'' if M'' has the Cartesian equation

x + 2y + z = 0.

10. Let L be the line through (1, 1, 1) parallel to the vector (2, -1, 3), and let M be the plane
through (1, 1, -2) spanned by the vectors (2, 1, 3) and (0, 1, 1). Prove that there is one and
only one point on the intersection L ∩ M and determine this point.

11. A line with direction vector X is said to be parallel to a plane M if X is parallel to M. Let
L be the line through (1, 1, 1) parallel to the vector (2, -1, 3). Determine whether L is parallel
to each of the following planes.

(a) The plane through (1, 1, -2) spanned by (2, 1, 3) and (§, 1, 1).

(b) The plane through (1, 1, -2), (3, 5, 2), and (2, 4, -1).

(c) The plane with Cartesian equation x + 2y + 3z = -3.

12. Two distinct points P and Q lie on a plane M. Prove that every point on the line through P 
and Q also lies on M. 

13. Given the line L through (1, 2, 3) parallel to the vector (1, 1, 1), and given a point (2, 3, 5) 
which is not on L. Find a Cartesian equation for the plane M through (2, 3, 5) which contains 
every point on L. 

14. Given a line L and a point P not on L. Prove that there is one and only one plane through 
P which contains every point on L. 


13.9 The cross product 

In many applications of vector algebra to problems in geometry and mechanics it is 
helpful to have an easy method for constructing a vector perpendicular to each of two 
given vectors A and B. This is accomplished by means of the cross product A x B (read 
“A cross B”) which is defined as follows: 


DEFINITION. Let A = (a_1, a_2, a_3) and B = (b_1, b_2, b_3) be two vectors in V_3. Their cross
product A × B (in that order) is defined to be the vector

$$A \times B = (a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1).$$

The following properties are easily deduced from this definition. 


THEOREM 13.12. For all vectors A, B, C in V_3 and for all real c we have:

(a) A × B = -(B × A) (skew symmetry),

(b) A × (B + C) = (A × B) + (A × C) (distributive law),

(c) c(A × B) = (cA) × B,

(d) A · (A × B) = 0 (orthogonality to A),

(e) B · (A × B) = 0 (orthogonality to B),

(f) ‖A × B‖² = ‖A‖²‖B‖² - (A · B)² (Lagrange’s identity),

(g) A × B = O if and only if A and B are linearly dependent.

Proof. Parts (a), (b), and (c) follow quickly from the definition and are left as exercises
for the reader. To prove (d), we note that

$$A \cdot (A \times B) = a_1(a_2 b_3 - a_3 b_2) + a_2(a_3 b_1 - a_1 b_3) + a_3(a_1 b_2 - a_2 b_1) = 0.$$

Part (e) follows in the same way, or it can be deduced from (a) and (d). To prove (f), we
write

$$\|A \times B\|^2 = (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2$$

and

$$\|A\|^2 \|B\|^2 - (A \cdot B)^2 = (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2$$

and then verify by brute force that the two right-hand members are identical.

Property (f) shows that A × B = O if and only if (A · B)² = ‖A‖²‖B‖². By the Cauchy-
Schwarz inequality (Theorem 12.3), this happens if and only if one of the vectors is a scalar
multiple of the other. In other words, A × B = O if and only if A and B are linearly
dependent, which proves (g).
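As a numerical sanity check on Theorem 13.12, the sketch below evaluates properties (a) through (g) for one particular choice of integer vectors. The helper names (`cross`, `dot`, `add`, `scale`) and the test vectors are illustrative assumptions. A check on sample vectors does not replace the proof, and for (g) only the direction "dependent implies A × B = O" is exercised.

```python
def cross(A, B):
    """Cross product of two vectors in V_3, straight from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(A, B):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(A, B))

def add(A, B):
    """Componentwise sum A + B."""
    return tuple(a + b for a, b in zip(A, B))

def scale(c, A):
    """Scalar multiple cA."""
    return tuple(c * a for a in A)

# Arbitrary integer test data, chosen only for illustration.
A, B, C, c = (2, -8, 3), (0, 4, 3), (1, 5, -2), 7
AxB = cross(A, B)

checks = {
    "a": AxB == scale(-1, cross(B, A)),                           # skew symmetry
    "b": cross(A, add(B, C)) == add(AxB, cross(A, C)),            # distributive law
    "c": scale(c, AxB) == cross(scale(c, A), B),
    "d": dot(A, AxB) == 0,                                        # orthogonality to A
    "e": dot(B, AxB) == 0,                                        # orthogonality to B
    "f": dot(AxB, AxB) == dot(A, A) * dot(B, B) - dot(A, B) ** 2,  # Lagrange's identity
    "g": cross(A, scale(3, A)) == (0, 0, 0),                      # dependent pair gives O
}
print(checks)
```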


EXAMPLES. Both (a) and (g) show that A × A = O. From the definition of cross product
we find that

i × j = k,  j × k = i,  k × i = j.

The cross product is not associative. For example, we have

i × (i × j) = i × k = -j  but  (i × i) × j = O × j = O.
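The relations among the unit coordinate vectors, and the failure of associativity, can be reproduced directly from the definition. This is a small sketch; the function name `cross` and the tuple representation of i, j, k are assumptions made for the illustration.

```python
def cross(A, B):
    """Cross product of two vectors in V_3, from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

# Unit coordinate vectors.
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# The cyclic relations i x j = k, j x k = i, k x i = j.
cyclic_ok = cross(i, j) == k and cross(j, k) == i and cross(k, i) == j

left = cross(i, cross(i, j))    # i x (i x j) = i x k = -j
right = cross(cross(i, i), j)   # (i x i) x j = O x j = O
print(cyclic_ok, left, right)
```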

The next theorem describes two more fundamental properties of the cross product. 


THEOREM 13.13. Let A and B be linearly independent vectors in V_3. Then we have the
following:

(a) The vectors A, B, A × B are linearly independent.

(b) Every vector N in V_3 orthogonal to both A and B is a scalar multiple of A × B.

Proof. Let C = A × B. Then C ≠ O since A and B are linearly independent. Given
scalars a, b, c such that aA + bB + cC = O, we take the dot product of each member with
C and use the relations A · C = B · C = 0 to find c = 0. This gives aA + bB = O, so
a = b = 0 since A and B are independent. This proves (a).

Let N be any vector orthogonal to both A and B, and let C = A × B. We shall prove
that

(N · C)² = (N · N)(C · C).

Then from the Cauchy-Schwarz inequality (Theorem 12.3) it follows that N is a scalar
multiple of C.

Since A, B, and C are linearly independent, we know, by Theorem 12.10(c), that they
span V_3. In particular, they span N, so we can write

N = aA + bB + cC

for some scalars a, b, c. This gives us

N · N = N · (aA + bB + cC) = cN · C

since N · A = N · B = 0. Also, since C · A = C · B = 0, we have

C · N = C · (aA + bB + cC) = cC · C.

Therefore, (N · N)(C · C) = (cN · C)(C · C) = (N · C)(cC · C) = (N · C)², which completes
the proof.

Theorem 13.12 helps us visualize the cross product geometrically. From properties (d)
and (e), we know that A × B is perpendicular to both A and B. When the vector A × B is
represented geometrically by an arrow, the direction of the arrow depends on the relative
positions of the three unit coordinate vectors. If i, j, and k are arranged as shown in Figure
13.4(a), they are said to form a right-handed coordinate system. In this case, the direction of
A × B is determined by the “right-hand rule.” That is to say, when A is rotated into B
in such a way that the fingers of the right hand point in the direction of rotation, then the
thumb indicates the direction of A × B (assuming, for the sake of the discussion, that the
thumb is perpendicular to the other fingers). In a left-handed coordinate system, as shown
in Figure 13.4(b), the direction of A × B is reversed and may be determined by a corresponding
left-hand rule.

Figure 13.4 Illustrating the relative positions of A, B, and A × B. (a) A right-handed coordinate system. (b) A left-handed coordinate system.

The length of A × B has an interesting geometric interpretation. If A and B are nonzero
vectors making an angle θ with each other, where 0 ≤ θ ≤ π, we may write A · B =
‖A‖ ‖B‖ cos θ in property (f) of Theorem 13.12 to obtain

$$\|A \times B\|^2 = \|A\|^2 \|B\|^2 (1 - \cos^2 \theta) = \|A\|^2 \|B\|^2 \sin^2 \theta,$$

from which we find

$$\|A \times B\| = \|A\| \, \|B\| \sin \theta.$$

Since ‖B‖ sin θ is the altitude of the parallelogram determined by A and B (see Figure 13.5),
we see that the length of A × B is equal to the area of this parallelogram.

Figure 13.5 The length of A × B is the area of the parallelogram determined by A and B.
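The two expressions for the area can be compared on a concrete parallelogram. The sketch below assumes the illustrative vectors A = (3, 0, 0) and B = (1, 2, 0), whose parallelogram has base 3 and altitude 2; the helper names are not from the text.

```python
import math

def cross(A, B):
    """Cross product of two vectors in V_3, from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def norm(A):
    """Length ||A|| of a vector."""
    return math.sqrt(dot(A, A))

A = (3.0, 0.0, 0.0)
B = (1.0, 2.0, 0.0)

# Angle between A and B, with 0 <= theta <= pi.
theta = math.acos(dot(A, B) / (norm(A) * norm(B)))

area_from_cross = norm(cross(A, B))                       # ||A x B||
area_from_angle = norm(A) * norm(B) * math.sin(theta)     # ||A|| ||B|| sin(theta)
print(area_from_cross, area_from_angle)
```

Because the second value passes through `acos` and `sin`, the two results should be compared with a tolerance (for example `math.isclose`) rather than with exact equality.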

13.10 The cross product expressed as a determinant 

The formula which defines the cross product can be put in a more compact form with the
aid of determinants. If a, b, c, d are four numbers, the difference ad - bc is often denoted
by the symbol

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

and is called a determinant (of order two). The numbers a, b, c, d are called its elements,
and they are said to be arranged in two horizontal rows, a, b and c, d, and in two vertical
columns, a, c and b, d. Note that an interchange of two rows or of two columns only changes
the sign of the determinant. For example, since ad - bc = -(bc - ad), we have

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = -\begin{vmatrix} b & a \\ d & c \end{vmatrix}.$$

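If we express each of the components of the cross product as a determinant of order two,
the formula defining A × B becomes

$$A \times B = \left( \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix},\; \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix},\; \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \right).$$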

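This can also be expressed in terms of the unit coordinate vectors i, j, k as follows:

$$A \times B = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i + \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k. \tag{13.7}$$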

Determinants of order three are written with three rows and three columns and they may
be defined in terms of second-order determinants by the formula

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}. \tag{13.8}$$

This is said to be an “expansion” of the determinant along its first row. Note that the
determinant on the right that multiplies a_1 may be obtained from that on the left by deleting
the row and column in which a_1 appears. The other two determinants on the right are
obtained similarly.

Determinants of order greater than three are discussed in Volume II. Our only purpose 
in introducing determinants of order two and three at this stage is to have a useful device 
for writing certain formulas in a compact form that makes them easier to remember. 

Determinants are meaningful if the elements in the first row are vectors. For example,
if we write the determinant

$$\begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}$$

and “expand” this according to the rule prescribed in (13.8), we find that the result is equal
to the right member of (13.7). In other words, we may write the definition of the cross
product A × B in the following compact form:

$$A \times B = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$

For example, to compute the cross product of A = 2i - 8j + 3k and B = 4j + 3k, we
write

$$A \times B = \begin{vmatrix} i & j & k \\ 2 & -8 & 3 \\ 0 & 4 & 3 \end{vmatrix} = \begin{vmatrix} -8 & 3 \\ 4 & 3 \end{vmatrix} i - \begin{vmatrix} 2 & 3 \\ 0 & 3 \end{vmatrix} j + \begin{vmatrix} 2 & -8 \\ 0 & 4 \end{vmatrix} k = -36i - 6j + 8k.$$
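The first-row expansion used in this computation translates directly into code. The following sketch, with the hypothetical helper names `det2` and `cross_by_expansion`, computes the coefficients of i, j, k as the three second-order determinants of (13.8) and reproduces the worked example.

```python
def det2(a, b, c, d):
    """Second-order determinant with rows (a, b) and (c, d): ad - bc."""
    return a * d - b * c

def cross_by_expansion(A, B):
    """Components of A x B, obtained by expanding the determinant whose
    rows are (i, j, k), A, B along its first row, as in (13.8)."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (
        det2(a2, a3, b2, b3),     # coefficient of i
        -det2(a1, a3, b1, b3),    # coefficient of j (note the minus sign)
        det2(a1, a2, b1, b2),     # coefficient of k
    )

A = (2, -8, 3)   # A = 2i - 8j + 3k
B = (0, 4, 3)    # B = 4j + 3k
result = cross_by_expansion(A, B)
print(result)
```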

13.11 Exercises 

1. Let A = -i + 2k, B = 2i + j - k, C = i + 2j + 2k. Compute each of the following
vectors in terms of i, j, k:

(a) A × B; (d) A × (C × A); (g) (A × C) × B;

(b) B × C; (e) (A × B) × C; (h) (A + B) × (A - C);

(c) C × A; (f) A × (B × C); (i) (A × B) × (A × C).

2. In each case find a vector of length 1 in V_3 orthogonal to both A and B:

(a) A = i + j + k, B = 2i + 3j - k;

(b) A = 2i - 3j + 4k, B = -i + 5j + 7k;

(c) A = i - 2j + 3k, B = -3i + 2j - k.

3. In each case use the cross product to compute the area of the triangle with vertices A, B, C:

(a) A = (0, 2, 2), B = (2, 0, -1), C = (3, 4, 0);

(b) A = (-2, 3, 1), B = (1, -3, 4), C = (1, 2, 1);

(c) A = (0, 0, 0), B = (0, 1, 1), C = (1, 0, 1).

4. If A = 2i + 5j + 3k, B = 2i + 7j + 4k, and C = 3i + 3j + 6k, express the cross product
(A - C) × (B - A) in terms of i, j, k.

5. Prove that ‖A × B‖ = ‖A‖ ‖B‖ if and only if A and B are orthogonal.

6. Given two linearly independent vectors A and B in V_3. Let C = (B × A) - B.

(a) Prove that A is orthogonal to B + C.

(b) Prove that the angle θ between B and C satisfies ½π < θ < π.

(c) If ‖B‖ = 1 and ‖B × A‖ = 2, compute the length of C.

7. Let A and B be two orthogonal vectors in V_3, each having length 1.

(a) Prove that A, B, A × B is an orthonormal basis for V_3.

(b) Let C = (A × B) × A. Prove that ‖C‖ = 1.

(c) Draw a figure showing the geometric relation between A, B, and A × B, and use this
figure to obtain the relations

(A × B) × A = B,  (A × B) × B = -A.

(d) Prove the relations in part (c) algebraically.

8. (a) If A × B = O and A · B = 0, then at least one of A or B is zero. Prove this statement
and give its geometric interpretation.

(b) Given A ≠ O. If A × B = A × C and A · B = A · C, prove that B = C.

9. Let A = 2i - j + 2k and C = 3i + 4j - k.

(a) Find a vector B such that A × B = C. Is there more than one solution?

(b) Find a vector B such that A × B = C and A · B = 1. Is there more than one solution?

10. Given a nonzero vector A and a vector C orthogonal to A, both vectors in V_3. Prove that there
is exactly one vector B such that A × B = C and A · B = 1.

11. Three vertices of a parallelogram are at the points A = (1, 0, 1), B = (-1, 1, 1), C =
(2, -1, 2).

(a) Find all possible points D which can be the fourth vertex of the parallelogram.

(b) Compute the area of triangle ABC.

12. Given two nonparallel vectors A and B in V_3 with A · B = 2, ‖A‖ = 1, ‖B‖ = 4. Let C =
2(A × B) - 3B. Compute A · (B + C), ‖C‖, and the cosine of the angle θ between B and C.

13. Given two linearly independent vectors A and B in V_3. Determine whether each of the following
statements is true or false.

(a) A + B, A - B, A × B are linearly independent.

(b) A + B, A + (A × B), B + (A × B) are linearly independent.

(c) A, B, (A + B) × (A - B) are linearly independent.

14. (a) Prove that three vectors A, B, C in V_3 lie on a line if and only if (B - A) × (C - A) = O.

(b) If A ≠ B, prove that the line through A and B consists of the set of all vectors P such
that (P - A) × (P - B) = O.

15. Given two orthogonal vectors A, B in V_3, each of length 1. Let P be a vector satisfying the
equation P × B = A - P. Prove each of the following statements.

(a) P is orthogonal to B and has length ½√2.

(b) P, B, P × B form a basis for V_3.

(c) (P × B) × B = -P.

(d) P = ½A - ½(A × B).


13.12 The scalar triple product 


The dot and cross products can be combined to form the scalar triple product A · B × C,
which can only mean A · (B × C). Since this is a dot product of two vectors, its value is a
scalar. We can compute this scalar by means of determinants. Write A = (a_1, a_2, a_3),
B = (b_1, b_2, b_3), C = (c_1, c_2, c_3) and express B × C according to Equation (13.7). Forming
the dot product with A, we obtain

$$A \cdot B \times C = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} + a_2 \begin{vmatrix} b_3 & b_1 \\ c_3 & c_1 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.$$

Thus, A · B × C is equal to the determinant whose rows are the components of the factors
A, B, and C.

In Theorem 13.12 we found that two vectors A and B are linearly dependent if and only 
if their cross product A x B is the zero vector. The next theorem gives a corresponding 
criterion for linear dependence of three vectors. 


THEOREM 13.14. Three vectors A, B, C in V_3 are linearly dependent if and only if

A · B × C = 0.

Proof. Assume first that A, B, and C are dependent. If B and C are dependent, then
B × C = O, and hence A · B × C = 0. Suppose, then, that B and C are independent.
Since all three are dependent, there exist scalars a, b, c, not all zero, such that aA + bB +
cC = O. We must have a ≠ 0 in this relation, otherwise B and C would be dependent.
Therefore, we can divide by a and express A as a linear combination of B and C, say A =
tB + sC. Taking the dot product of each member with B × C, we find

A · (B × C) = tB · B × C + sC · B × C = 0,

since each of B and C is orthogonal to B × C. Therefore dependence of A, B, and C
implies A · B × C = 0.

To prove the converse, assume that A · B × C = 0. If B and C are dependent, then so
are A, B, and C, and there is nothing more to prove. Assume then, that B and C are linearly
independent. Then, by Theorem 13.13, the three vectors B, C, and B × C are linearly
independent. Hence, they span A so we can write

A = aB + bC + c(B × C)

for some scalars a, b, c. Taking the dot product of each member with B × C and using the
fact that A · (B × C) = 0, we find c = 0, so A = aB + bC. This proves that A, B, and C
are linearly dependent.


EXAMPLE. To determine whether the three vectors (2, 3, -1), (3, -7, 5), and (1, -5, 2)
are dependent, we form their scalar triple product, expressing it as the determinant

$$\begin{vmatrix} 2 & 3 & -1 \\ 3 & -7 & 5 \\ 1 & -5 & 2 \end{vmatrix} = 2(-14 + 25) - 3(6 - 5) - 1(-15 + 7) = 27.$$

Since the scalar triple product is nonzero, the vectors are linearly independent.
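The dependence test of Theorem 13.14 is easy to automate. The sketch below uses hypothetical helper names (`cross`, `dot`, `triple`), reproduces the value 27 for the example, and applies the same test to a second triple, chosen here only for illustration, whose scalar triple product vanishes.

```python
def cross(A, B):
    """Cross product of two vectors in V_3, from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def triple(A, B, C):
    """Scalar triple product A . (B x C)."""
    return dot(A, cross(B, C))

def linearly_dependent(A, B, C):
    """Theorem 13.14: dependent if and only if A . B x C = 0 (exact for integers)."""
    return triple(A, B, C) == 0

# The triple from the example.
example_value = triple((2, 3, -1), (3, -7, 5), (1, -5, 2))

# An illustrative dependent triple: the third vector is 2*(second) - (first).
dependent = linearly_dependent((1, 2, 3), (4, 5, 6), (7, 8, 9))
print(example_value, dependent)
```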

The scalar triple product has an interesting geometric interpretation. Figure 13.6 shows
a parallelepiped determined by three geometric vectors A, B, C not in the same plane. Its
altitude is ‖C‖ cos φ, where φ is the angle between A × B and C. In this figure, cos φ
is positive because 0 ≤ φ < ½π. The area of the parallelogram which forms the base is
‖A × B‖, and this is also the area of each cross section parallel to the base. Integrating
the cross-sectional area from 0 to ‖C‖ cos φ, we find that the volume of the parallelepiped
is ‖A × B‖ (‖C‖ cos φ), the area of the base times the altitude. But we have

‖A × B‖ (‖C‖ cos φ) = (A × B) · C.

In other words, the scalar triple product A × B · C is equal to the volume of the parallelepiped
determined by A, B, C. When ½π < φ ≤ π, cos φ is negative and the product
A × B · C is the negative of the volume. If A, B, C are on a plane through the origin, they
are linearly dependent and their scalar triple product is zero. In this case, the parallelepiped
degenerates and has zero volume.


Figure 13.6 Geometric interpretation of the scalar triple product as the volume of a parallelepiped.


This geometric interpretation of the scalar triple product suggests certain algebraic
properties of this product. For example, a cyclic permutation of the vectors A, B, C
leaves the scalar triple product unchanged. By this we mean that

(13.9) A × B · C = B × C · A = C × A · B.

An algebraic proof of this property is outlined in Exercise 7 of Section 13.14. This property
implies that the dot and cross are interchangeable in a scalar triple product. In fact, the
commutativity of the dot product implies (B × C) · A = A · (B × C) and when this is
combined with the first equation in (13.9), we find that

(13.10) A × B · C = A · B × C.

The scalar triple product A · B × C is often denoted by the symbol [ABC] without indicating
the dot or cross. Because of Equation (13.10), there is no ambiguity in this notation:
the product depends only on the order of the factors A, B, C and not on the positions of
the dot and cross.

13.13 Cramer’s rule for solving a system of three linear equations 

The scalar triple product may be used to solve a system of three simultaneous linear
equations in three unknowns x, y, z. Suppose the system is written in the form

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= d_1, \\ a_2 x + b_2 y + c_2 z &= d_2, \\ a_3 x + b_3 y + c_3 z &= d_3. \end{aligned} \tag{13.11}$$

Let A be the vector with components a_1, a_2, a_3 and define B, C, and D similarly. Then the
three equations in (13.11) are equivalent to the single vector equation

$$xA + yB + zC = D. \tag{13.12}$$

If we dot multiply both sides of this equation with B × C, writing [ABC] for A · B × C,
we find that

$$x[ABC] + y[BBC] + z[CBC] = [DBC].$$

Since [BBC] = [CBC] = 0, the coefficients of y and z drop out and we obtain

$$x = \frac{[DBC]}{[ABC]} \quad \text{if } [ABC] \neq 0. \tag{13.13}$$

A similar argument yields analogous formulas for y and z. Thus we have

$$y = \frac{[ADC]}{[ABC]} \quad \text{and} \quad z = \frac{[ABD]}{[ABC]} \quad \text{if } [ABC] \neq 0. \tag{13.14}$$

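The condition [ABC] ≠ 0 means that the three vectors A, B, C are linearly independent.
In this case, (13.12) shows that every vector D in 3-space is spanned by A, B, C and the
multipliers x, y, z are uniquely determined by the formulas in (13.13) and (13.14). When
the scalar triple products that occur in these formulas are written as determinants, the
result is known as Cramer’s rule for solving the system (13.11):

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}, \qquad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}.$$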

If [ABC] = 0, then A, B, C lie on a plane through the origin and the system has no
solution unless D lies in the same plane. In this latter case, it is easy to show that there are
infinitely many solutions of the system. In fact, the vectors A, B, C are linearly dependent
so there exist scalars u, v, w not all zero such that uA + vB + wC = O. If the triple (x, y, z)
satisfies (13.12) then so does the triple (x + tu, y + tv, z + tw) for all real t, since we have

$$(x + tu)A + (y + tv)B + (z + tw)C = xA + yB + zC + t(uA + vB + wC) = xA + yB + zC.$$
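Formulas (13.13) and (13.14) can be turned into a short solver. The sketch below is one possible rendering, not a prescribed implementation: the function names are assumptions, exact rational arithmetic is used through Python's `fractions.Fraction`, and the sample system x + y + z = 6, 2x - y + z = 3, x + 2y - z = 2 is chosen only for illustration. Note that A, B, C hold the coefficients of x, y, z respectively, read down the columns of the system.

```python
from fractions import Fraction

def cross(A, B):
    """Cross product of two vectors in V_3, from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def triple(A, B, C):
    """Scalar triple product [ABC] = A . (B x C)."""
    return dot(A, cross(B, C))

def cramer(A, B, C, D):
    """Solve xA + yB + zC = D by (13.13) and (13.14); requires [ABC] != 0."""
    abc = triple(A, B, C)
    if abc == 0:
        raise ValueError("[ABC] = 0: A, B, C are linearly dependent")
    x = Fraction(triple(D, B, C), abc)
    y = Fraction(triple(A, D, C), abc)
    z = Fraction(triple(A, B, D), abc)
    return (x, y, z)

# Illustrative system: x + y + z = 6, 2x - y + z = 3, x + 2y - z = 2.
A = (1, 2, 1)    # coefficients of x
B = (1, -1, 2)   # coefficients of y
C = (1, 1, -1)   # coefficients of z
D = (6, 3, 2)    # right-hand members

solution = cramer(A, B, C, D)
print(solution)
```

When [ABC] = 0 the function raises an error instead of returning a value, which mirrors the case distinction in the text: the system then has either no solution or infinitely many.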


13.14 Exercises 

1. Compute the scalar triple product A · B × C in each case.

(a) A = (3, 0, 0), B = (0, 4, 0), C = (0, 0, 8).

(b) A = (2, 3, -1), B = (3, -7, 5), C = (1, -5, 2).

(c) A = (2, 1, 3), B = (-3, 0, 6), C = (4, 5, -1).

2. Find all real t for which the three vectors (1, t, 1), (t, 1, 0), (0, 1, t) are linearly dependent.

3. Compute the volume of the parallelepiped determined by the vectors i + j, j + k, k + i.

4. Prove that A × B = A · (B × i)i + A · (B × j)j + A · (B × k)k.

5. Prove that i × (A × i) + j × (A × j) + k × (A × k) = 2A.

6. (a) Find all vectors ai + bj + ck which satisfy the relation

(ai + bj + ck) · k × (6i + 3j + 4k) = 3.

(b) Find that vector ai + bj + ck of shortest length which satisfies the relation in (a). 

7. Use algebraic properties of the dot and cross products to derive the following properties of
the scalar triple product.

(a) (A + B) · (A + B) × C = 0.

(b) A · B × C = -B · A × C. This shows that switching the first two vectors reverses the
sign. [Hint: Use part (a) and distributive laws.]

(c) A · B × C = -A · C × B. This shows that switching the second and third vectors
reverses the sign. [Hint: Use skew-symmetry.]

(d) A · B × C = -C · B × A. This shows that switching the first and third vectors reverses
the sign. [Hint: Use (b) and (c).]

Equating the right members of (b), (c), and (d), we find that

A · B × C = B · C × A = C · A × B,

which shows that a cyclic permutation of A, B, C leaves their scalar triple product unchanged.

9. This exercise outlines a proof of the vector identity

(13.15) A × (B × C) = (C · A)B - (B · A)C,

sometimes referred to as the “cab minus bac” formula. Let B = (b_1, b_2, b_3), C = (c_1, c_2, c_3)
and prove that

i × (B × C) = c_1 B - b_1 C.

This proves (13.15) in the special case A = i. Prove corresponding formulas for A = j and
A = k, and then combine them to obtain (13.15).

10. Use the “cab minus bac” formula of Exercise 9 to derive the following vector identities.

(a) (A × B) × (C × D) = (A × B · D)C - (A × B · C)D.

(b) A × (B × C) + B × (C × A) + C × (A × B) = O.

(c) A × (B × C) = (A × B) × C if and only if B × (C × A) = O.

(d) (A × B) · (C × D) = (B · D)(A · C) - (B · C)(A · D).

11. Four vectors A, B, C, D in V_3 satisfy the relations A × C · B = 5, A × D · B = 3, C + D =
i + 2j + k, C - D = i - k. Compute (A × B) × (C × D) in terms of i, j, k.

12. Prove that (A × B) · (B × C) × (C × A) = (A · B × C)².

13. Prove or disprove the formula A × [A × (A × B)] · C = -‖A‖² A · B × C.

14. (a) Prove that the volume of the tetrahedron whose vertices are A, B, C, D is

$$\tfrac{1}{6}\, |(B - A) \cdot (C - A) \times (D - A)|.$$

(b) Compute this volume when A = (1, 1, 1), B = (0, 0, 2), C = (0, 3, 0), and D = (4, 0, 0).

15. (a) If B ≠ C, prove that the perpendicular distance from A to the line through B and C is

$$\|(A - B) \times (C - B)\| / \|B - C\|.$$

(b) Compute this distance when A = (1, -2, -5), B = (-1, 1, 1), and C = (4, 5, 1).

16. Heron’s formula for computing the area S of a triangle whose sides have lengths a, b, c states
that S = √(s(s - a)(s - b)(s - c)), where s = (a + b + c)/2. This exercise outlines a
vectorial proof of this formula.

Assume the triangle has vertices at O, A, and B, with ‖A‖ = a, ‖B‖ = b, ‖B - A‖ = c.

(a) Combine the two identities

$$\|A \times B\|^2 = \|A\|^2 \|B\|^2 - (A \cdot B)^2, \qquad -2A \cdot B = \|A - B\|^2 - \|A\|^2 - \|B\|^2$$

to obtain the formula

$$4S^2 = a^2 b^2 - \tfrac{1}{4}(c^2 - a^2 - b^2)^2 = \tfrac{1}{4}(2ab - c^2 + a^2 + b^2)(2ab + c^2 - a^2 - b^2).$$

(b) Rewrite the formula in part (a) to obtain

$$S^2 = \tfrac{1}{16}(a + b + c)(a + b - c)(c - a + b)(c + a - b),$$

and thereby deduce Heron’s formula.

Use Cramer’s rule to solve the system of equations in each of Exercises 17, 18, and 19.

17. x + 2y + 3z = 5, 2x - y + 4z = 11, -y + z = 3.

18. x + y + 2z = 4, 3x - y - z = 2, 2x + 5y + 3z = 3.

19. x + y = 5, x + z = 2, y + z = 5.

20. If P = (1, 1, 1) and A = (2, 1, -1), prove that each point (x, y, z) on the line {P + tA}
satisfies the system of linear equations x - y + z = 1, x + y + 3z = 5, 3x + y + 7z = 11.


13.15 Normal vectors to planes 

A plane was defined in Section 13.6 as a set of the form {P + sA + tB}, where A and B
are linearly independent vectors. Now we show that planes in V_3 can be described in an
entirely different way, using the concept of a normal vector.


DEFINITION. Let M = {P + sA + tB} be the plane through P spanned by A and B. A
vector N in V_3 is said to be perpendicular to M if N is perpendicular to both A and B. If, in
addition, N is nonzero, then N is called a normal vector to the plane.

Note: If N · A = N · B = 0, then N · (sA + tB) = 0, so a vector perpendicular to
both A and B is perpendicular to every vector in the linear span of A and B. Also, if
N is normal to a plane, so is tN for every real t ≠ 0.

THEOREM 13.15. Given a plane M = {P + sA + tB} through P spanned by A and B.
Let N = A × B. Then we have the following:

(a) N is a normal vector to M.

(b) M is the set of all X in V_3 satisfying the equation

(13.16) (X - P) · N = 0.

Proof. Since M is a plane, A and B are linearly independent, so A × B ≠ O. This
proves (a) since A × B is orthogonal to both A and B.

To prove (b), let M' be the set of all X in V_3 satisfying Equation (13.16). If X ∈ M, then
X - P is in the linear span of A and B, so X - P is orthogonal to N. Therefore X ∈ M',
which proves that M ⊆ M'. Conversely, suppose X ∈ M'. Then X satisfies (13.16). Since
A, B, N are linearly independent (Theorem 13.13), they span every vector in V_3 so, in
particular, we have

X - P = sA + tB + uN

for some scalars s, t, u. Taking the dot product of each member with N, we find u = 0,
so X - P = sA + tB. This shows that X ∈ M. Hence, M' ⊆ M, which completes the
proof of (b).

The geometric meaning of Theorem 13.15 is shown in Figure 13.7. The points P and X 
are on the plane and the normal vector N is orthogonal to X — P. This figure suggests the 
following theorem. 


THEOREM 13.16. Given a plane M through a point P, and given a nonzero vector N normal
to M, let

$$d = \frac{|P \cdot N|}{\|N\|}. \tag{13.17}$$

Then every X on M has length ‖X‖ ≥ d. Moreover, we have ‖X‖ = d if and only if X is the
projection of P along N:

$$X = tN, \quad \text{where} \quad t = \frac{P \cdot N}{N \cdot N}.$$

Proof. The proof follows from the Cauchy-Schwarz inequality in exactly the same
way as we proved Theorem 13.6, the corresponding result for lines in V_2.

By the same argument we find that if Q is a point not on M, then among all points X
on M the smallest length ‖X - Q‖ occurs when X - Q is the projection of P - Q along
N. This minimum length is |(P - Q) · N| / ‖N‖ and is called the distance from Q to the
plane. The number d in (13.17) is the distance from the origin to the plane.
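The distance formula |(P - Q) · N| / ‖N‖ can be sketched in a few lines. The example below is an illustration under assumptions: the helper names are hypothetical, and the plane is taken to be the one through P = (1, 2, 3) spanned by A = (1, 2, 1) and B = (1, -4, -1), with normal N = A × B as in Theorem 13.15.

```python
import math

def cross(A, B):
    """Cross product of two vectors in V_3, from the definition."""
    a1, a2, a3 = A
    b1, b2, b3 = B
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(A, B):
    return sum(a * b for a, b in zip(A, B))

def sub(A, B):
    """Componentwise difference A - B."""
    return tuple(a - b for a, b in zip(A, B))

def distance_to_plane(Q, P, N):
    """Distance |(P - Q) . N| / ||N|| from Q to the plane through P with normal N."""
    return abs(dot(sub(P, Q), N)) / math.sqrt(dot(N, N))

P, A, B = (1, 2, 3), (1, 2, 1), (1, -4, -1)
N = cross(A, B)                                        # a normal vector to the plane

origin_distance = distance_to_plane((0, 0, 0), P, N)   # the number d of (13.17)
on_plane_distance = distance_to_plane((2, 4, 4), P, N) # (2, 4, 4) = P + A lies on the plane
print(N, origin_distance, on_plane_distance)
```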


13.16 Linear Cartesian equations for planes 

The results of Theorems 13.15 and 13.16 can also be expressed in terms of components. 
If we write N = (a, b, c), P = (x 1 , y i , z,), and X = (x, y, z), Equation (13.16) becomes 

(13.18) a(x — Xj) + b(y jj) + c(z ~ zf = 0. 

This is called a Cartesian equation for the plane, and it is satisfied by those and only those 
points (x, y, z) which lie on the plane. The set of points satisfying (13.18) is not altered if 
we multiply each of a, b, c by a nonzero scalar t. This simply amounts to a different choice 
of normal vector in (13.16). 

We may transpose the terms not involving x, y, and z, and write (13.18) in the form

(13.19)    $ax + by + cz = d_1$,

where $d_1 = ax_1 + by_1 + cz_1$. An equation of this type is said to be linear in x, y, and z.
We have just shown that every point (x, y, z) on a plane satisfies a linear Cartesian equation
(13.19) in which not all three of a, b, c are zero. Conversely, every linear equation with this
property represents a plane. (The reader may verify this as an exercise.)

The number $d_1$ in Equation (13.19) bears a simple relation to the distance d of the plane
from the origin. Since $d_1 = P \cdot N$, we have $|d_1| = |P \cdot N| = d\,\|N\|$. In particular $|d_1| = d$
if the normal N has length 1. The plane passes through the origin if and only if $d_1 = 0$.



Figure 13.7 A plane through P and X with normal vector N.

Figure 13.8 A plane with intercepts 3, 1, 2.


example. The Cartesian equation 2x + 6y + 3z = 6 represents a plane with normal
vector N = 2i + 6j + 3k. We rewrite the Cartesian equation in the form

$\frac{x}{3} + \frac{y}{1} + \frac{z}{2} = 1$,

from which it is apparent that the plane intersects the coordinate axes at the points (3, 0, 0),
(0, 1, 0), and (0, 0, 2). The numbers 3, 1, 2 are called, respectively, the x-, y-, and z-intercepts
of the plane. A knowledge of the intercepts makes it possible to sketch the plane
quickly. A portion of the plane is shown in Figure 13.8. Its distance d from the origin is
$d = 6/\|N\| = 6/7$.
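The intercepts and the distance in this example can be reproduced with a few lines of code. This is only an illustrative sketch; the function names `intercepts` and `distance_from_origin` are assumptions introduced here, not notation from the text.

```python
import math

def intercepts(a, b, c, d1):
    """Axis intercepts of the plane ax + by + cz = d1 (None where a coefficient is zero)."""
    return tuple(d1 / k if k != 0 else None for k in (a, b, c))

def distance_from_origin(a, b, c, d1):
    """d = |d1| / ||N|| for the normal N = (a, b, c)."""
    return abs(d1) / math.sqrt(a * a + b * b + c * c)

print(intercepts(2, 6, 3, 6))            # x-, y-, z-intercepts of 2x + 6y + 3z = 6
print(distance_from_origin(2, 6, 3, 6))  # 6/7, since ||N|| = 7
```

A zero coefficient means the plane is parallel to (or contains) the corresponding axis, so no single intercept is reported for it.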

Two parallel planes will have a common normal N. If N = (a, b, c), the Cartesian equations
of two parallel planes can be written as follows:

$ax + by + cz = d_1$,    $ax + by + cz = d_2$,

the only difference being in the right-hand members. The number $|d_1 - d_2| / \|N\|$ is called
the perpendicular distance between the two planes, a definition suggested by Theorem 13.16.

Two planes are called perpendicular if a normal of one is perpendicular to a normal of the
other. More generally, if the normals of two planes make an angle $\theta$ with each other, then
we say that $\theta$ is an angle between the two planes.
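Both definitions translate directly into code. The sketch below is an illustration under assumed names (`angle_between_planes`, `distance_between_parallel_planes`) and made-up sample normals; it is not taken from the text.

```python
import math

def angle_between_planes(n1, n2):
    """An angle between two planes: the angle between their normals n1 and n2."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norms = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos_theta = max(-1.0, min(1.0, dot / norms))  # clamp against rounding error
    return math.acos(cos_theta)

def distance_between_parallel_planes(n, d1, d2):
    """|d1 - d2| / ||N|| for the planes N . X = d1 and N . X = d2."""
    return abs(d1 - d2) / math.sqrt(sum(a * a for a in n))

# Hypothetical sample data.
print(angle_between_planes((1, 0, 0), (1, 1, 0)))            # pi/4
print(angle_between_planes((1, 1, 0), (1, -1, 0)))           # pi/2: perpendicular planes
print(distance_between_parallel_planes((2, 6, 3), 6, 20))    # 14/7 = 2
```

Replacing one normal by its negative changes the angle from $\theta$ to $\pi - \theta$, which is why the text speaks of "an" angle between the planes.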


13.17 Exercises 

1. Given vectors A = 2i + 3j - 4k and B = j + k.

(a) Find a nonzero vector N perpendicular to both A and B.

(b) Give a Cartesian equation for the plane through the origin spanned by A and B.

(c) Give a Cartesian equation for the plane through (1, 2, 3) spanned by A and B.

2. A plane has Cartesian equation x + 2y — 2z + 7 = 0. Find the following: 

(a) a normal vector of unit length; 

(b) the intercepts of the plane; 

(c) the distance of the plane from the origin; 

(d) the point Q on the plane nearest the origin. 

3. Find a Cartesian equation of the plane which passes through (1, 2, -3) and is parallel to the
plane given by 3x - y + 2z = 4. What is the distance between the two planes?

4. Four planes have Cartesian equations x + 2y - 2z = 5, 3x - 6y + 3z = 2, 2x + y + 2z = -1,
and x - 2y + z = 1.

(a) Show that two of them are parallel and the other two are perpendicular. 

(b) Find the distance between the two parallel planes. 

5. The three points (1, 1, -1), (3, 3, 2), and (3, -1, -2) determine a plane. Find (a) a vector
normal to the plane; (b) a Cartesian equation for the plane; (c) the distance of the plane 
from the origin. 

6. Find a Cartesian equation for the plane determined by (1, 2, 3), (2, 3, 4), and (-1, 7, -2). 

7. Determine an angle between the planes with Cartesian equations x + y = 1 and y + z = 2. 

8. A line parallel to a nonzero vector N is said to be perpendicular to a plane M if N is normal 
to M. Find a Cartesian equation for the plane through (2, 3, —7), given that the line through 
(1, 2, 3) and (2, 4, 12) is perpendicular to this plane. 

9. Find a vector parametric equation for the line which contains the point (2, 1, -3) and is 
perpendicular to the plane given by 4x — 3y + z = 5. 

10. A point moves in space in such a way that at time t its position is given by the vector
X(t) = (1 - t)i + (2 - 3t)j + (2t - 1)k.

(a) Prove that the point moves along a line. (Call it L.)

(b) Find a vector N parallel to L.

(c) At what time does the point strike the plane given by 2x + 3y + 2z + 1 = 0?

(d) Find a Cartesian equation for that plane parallel to the one in part (c) which contains 
the point X(3). 

(e) Find a Cartesian equation for that plane perpendicular to L which contains the point X(2). 

11. Find a Cartesian equation for the plane through (1, 1, 1) if a normal vector N makes angles
$\frac{1}{3}\pi$, $\frac{1}{4}\pi$, $\frac{1}{3}\pi$ with i, j, k, respectively.

12. Compute the volume of the tetrahedron whose vertices are at the origin and at the points
where the coordinate axes intersect the plane given by x + 2y + 3z = 6.

13. Find a vector A of length 1 perpendicular to i + 2j - 3k and parallel to the plane with
Cartesian equation x - y + 5z = 1.

14. Find a Cartesian equation of the plane which is parallel to both vectors i + j and j + k and
intersects the x-axis at (2, 0, 0).

15. Find all points which lie on the intersection of the three planes given by 3x + y + z = 5, 
3x + y + 5z = 7, x — y + 3z = 3. 

16. Prove that three planes whose normals are linearly independent intersect in one and only
one point.

17. A line with direction vector A is said to be parallel to a plane M if A is parallel to M. A line
containing (1, 2, 3) is parallel to each of the planes given by x + 2y + 3z = 4, 2x + 3y +
4z = 5. Find a vector parametric equation for this line.

18. Given a line L not parallel to a plane M, prove that the intersection $L \cap M$ contains exactly
one point.

19. (a) Prove that the distance from the point $(x_0, y_0, z_0)$ to the plane with Cartesian equation
ax + by + cz + d = 0 is

$\frac{|ax_0 + by_0 + cz_0 + d|}{(a^2 + b^2 + c^2)^{1/2}}$.

(b) Find the point P on the plane given by 5x - 14y + 2z + 9 = 0 which is nearest to the
point Q = (-2, 15, -7).

20. Find a Cartesian equation for the plane parallel to the plane given by 2x — y + 2z + 4 = 0 
if the point (3, 2, -1) is equidistant from both planes. 

21. (a) If three points A, B, C determine a plane, prove that the distance from a point Q to this
plane is $|(Q - A) \cdot (B - A) \times (C - A)| / \|(B - A) \times (C - A)\|$.

(b) Compute this distance if Q = (1, 0, 0), A = (0, 1, 1), B = (1, -1, 1), and C = (2, 3, 4). 

22. Prove that if two planes M and M’ are not parallel, their intersection is a line. 

23. Find a Cartesian equation for the plane which is parallel to j and which passes through the 
intersection of the planes described by the equations x + 2y + 3z = 4, and 2x + y + z = 2. 

24. Find a Cartesian equation for the plane parallel to the vector 3i - j + 2k if it contains every
point on the line of intersection of the planes with equations x + y = 3 and 2 y + 3z = 4. 

13.18 The conic sections 

A moving line G which intersects a fixed line A at a given point P, making a constant
angle $\theta$ with A, where $0 < \theta < \frac{1}{2}\pi$, generates a surface in 3-space called a right circular
cone. The line G is called a generator of the cone, A is its axis, and P its vertex. Each of the
cones shown in Figure 13.9 has a vertical axis. The upper and lower portions of the cone
meeting at the vertex are called nappes of the cone. The curves obtained by slicing the
cone with a plane not passing through the vertex are called conic sections, or simply conics.
If the cutting plane is parallel to a line of the cone through the vertex, the conic is called a
parabola. Otherwise the intersection is called an ellipse or a hyperbola, according as the
plane cuts just one or both nappes. (See Figure 13.9.) The hyperbola consists of two
“branches,” one on each nappe.

Figure 13.9 The conic sections.

Many important discoveries in both pure and applied mathematics have been related
to the conic sections. Apollonius' treatment of conics as early as the 3rd century B.C. was
one of the most profound achievements of classical Greek geometry. Nearly 2000 years
later, Galileo discovered that a projectile fired horizontally from the top of a tower falls 
to earth along a parabolic path (if air resistance is neglected and if the motion takes place 
above a part of the earth that can be regarded as a flat plane). One of the turning points in 
the history of astronomy occurred around 1600 when Kepler suggested that all planets 
move in elliptical orbits. Some 80 years later, Newton was able to demonstrate that an 
elliptical planetary path implies an inverse-square law of gravitational attraction. This led 
Newton to formulate his famous theory of universal gravitation which has often been 
referred to as the greatest scientific discovery ever made. Conic sections appear not only as 
orbits of planets and satellites but also as trajectories of elementary atomic particles. They 
are used in the design of lenses and mirrors, and in architecture. These examples and many 
others show that the importance of the conic sections can hardly be overestimated. 


There are other equivalent definitions of the conic sections. One of these refers to special
points known as foci (singular: focus). An ellipse may be defined as the set of all points in a
plane the sum of whose distances $d_1$ and $d_2$ from two fixed points $F_1$ and $F_2$ (the foci) is
constant. (See Figure 13.10.) If the foci coincide, the ellipse reduces to a circle. A hyperbola
is the set of all points for which the difference $|d_1 - d_2|$ is constant. A parabola is the
set of all points in a plane for which the distance to a fixed point F (called the focus) is
equal to the distance to a given line (called the directrix).

Figure 13.10 Focal definitions of the conic sections: $d_1 + d_2$ = constant (ellipse);
$|d_1 - d_2|$ = constant (hyperbola).

There is a very simple and elegant argument which shows that the focal property of an
ellipse is a consequence of its definition as a section of a cone. This proof, which we may
refer to as the “ice-cream-cone proof,” was discovered in 1822 by a Belgian mathematician,
G. P. Dandelin (1794-1847), and makes use of the two spheres $S_1$ and $S_2$ which are drawn
so as to be tangent to the cutting plane and the cone, as illustrated in Figure 13.11. These
spheres touch the cone along two parallel circles $C_1$ and $C_2$. We shall prove that the points
$F_1$ and $F_2$, where the spheres contact the plane, can serve as foci of the ellipse.

Let P be an arbitrary point of the ellipse. The problem is to prove that $\|PF_1\| + \|PF_2\|$
is constant, that is, independent of the choice of P. For this purpose, draw that line on the
cone from the vertex O to P and let $A_1$ and $A_2$ be its intersections with the circles $C_1$
and $C_2$, respectively. Then $PF_1$ and $PA_1$ are two tangents to $S_1$ from P, and hence
$\|PF_1\| = \|PA_1\|$. Similarly $\|PF_2\| = \|PA_2\|$, and therefore we have

$\|PF_1\| + \|PF_2\| = \|PA_1\| + \|PA_2\|$.

But $\|PA_1\| + \|PA_2\| = \|A_1 A_2\|$, which is the distance between the parallel circles $C_1$ and
$C_2$ measured along the surface of the cone. This proves that $F_1$ and $F_2$ can serve as foci of
the ellipse, as asserted.

Modifications of this proof work also for the hyperbola and the parabola. In the case
of the hyperbola, the proof employs one sphere in each portion of the cone. For the
parabola one sphere tangent to the cutting plane at the focus F is used. This sphere touches
the cone along a circle which lies in a plane whose intersection with the cutting plane is the
directrix of the parabola. With these hints the reader should be able to show that the focal
properties of the hyperbola and parabola may be deduced from their definitions as sections
of a cone.


13.19 Eccentricity of conic sections 

Another characteristic property of conic sections involves a concept called eccentricity. 
A conic section can be defined as a curve traced out by a point moving in a plane in such 
a way that the ratio of its distances from a fixed point and a fixed line is constant. This 
constant ratio is called the eccentricity of the curve and is denoted by e. (This should not be 
confused with the Euler number e.) The curve is an ellipse if 0 < e < 1, a parabola if 
e = 1, and a hyperbola if e > 1. The fixed point is called a focus and the fixed line a 
directrix. 

We shall adopt this definition as the basis for our study of the conic sections since it 
permits a simultaneous treatment of all three types of conics and lends itself to the use of 
vector methods. In this discussion it is understood that all points and lines are in the same 
plane. 


definition. Given a line L, a point F not on L, and a positive number e. Let d(X, L)
denote the distance from a point X to L. The set of all X satisfying the relation

(13.20)    $\|X - F\| = e\, d(X, L)$

is called a conic section with eccentricity e. The conic is called an ellipse if e < 1, a parabola
if e = 1, and a hyperbola if e > 1.


If N is a vector normal to L and if P is any point on L, the distance d(X, L) from any
point X to L is given by the formula

$d(X, L) = \frac{|(X - P) \cdot N|}{\|N\|}$.

When N has length 1, this simplifies to $d(X, L) = |(X - P) \cdot N|$, and the basic equation
(13.20) for the conic sections becomes

(13.21)    $\|X - F\| = e\,|(X - P) \cdot N|$.

The line L separates the plane into two parts which we shall arbitrarily label as “positive”
and “negative” according to the choice of N. If $(X - P) \cdot N > 0$, we say that X is in the
positive half-plane, and if $(X - P) \cdot N < 0$, we say that X is in the negative half-plane.
On the line L itself we have $(X - P) \cdot N = 0$. In Figure 13.12 the choice of the normal
vector N dictates that points to the right of L are in the positive half-plane and those to the
left are in the negative half-plane.

Now we place the focus F in the negative half-plane, as indicated in Figure 13.12, and
choose P to be that point on L nearest to F. Then P - F = dN, where $|d| = \|P - F\|$ is
the distance from the focus to the directrix. Since F is in the negative half-plane, we have
$(F - P) \cdot N = -d < 0$, so d is positive. Replacing P by F + dN in (13.21), we obtain the
following theorem, which is illustrated in Figure 13.12.

Figure 13.12 A conic section with eccentricity e is the set of all X satisfying
$\|X - F\| = e\,|(X - F) \cdot N - d|$.


theorem 13.17. Let C be a conic section with eccentricity e, focus F, and directrix L
at a distance d from F. If N is a unit normal to L and if F is in the negative half-plane
determined by N, then C consists of all points X satisfying the equation

(13.22)    $\|X - F\| = e\,|(X - F) \cdot N - d|$.
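Equation (13.22) gives a direct membership test for points in the plane. The sketch below is illustrative only; the names `conic_residual` and `conic_type`, and the sample focus, normal, and distance, are assumptions introduced for this example.

```python
import math

def conic_residual(x, focus, normal, e, d):
    """||X - F|| - e |(X - F) . N - d| for a unit normal N; zero exactly on the conic."""
    diff = [xi - fi for xi, fi in zip(x, focus)]
    dist_to_focus = math.hypot(diff[0], diff[1])
    signed = sum(di * ni for di, ni in zip(diff, normal))
    return dist_to_focus - e * abs(signed - d)

def conic_type(e):
    """Classify a conic by its eccentricity e > 0."""
    if e <= 0:
        raise ValueError("eccentricity must be positive")
    return "ellipse" if e < 1 else "parabola" if e == 1 else "hyperbola"

# Hypothetical parabola: focus at the origin, N = i, directrix at distance d = 2.
F, N, d, e = (0.0, 0.0), (1.0, 0.0), 2.0, 1.0
print(conic_type(e))
print(conic_residual((1.0, 0.0), F, N, e, d))   # 0: the vertex lies on the curve
print(conic_residual((0.0, 2.0), F, N, e, d))   # 0: a point directly above the focus
print(conic_residual((0.0, 0.0), F, N, e, d))   # nonzero: the focus is not on the curve
```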


13.20 Polar equations for conic sections 

The equation in Theorem 13.17 can be simplified if we place the focus in a special
position. For example, if the focus is at the origin the equation becomes

(13.23)    $\|X\| = e\,|X \cdot N - d|$.

This form is especially useful if we wish to express X in terms of polar coordinates. Take
the directrix L to be vertical, as shown in Figure 13.13, and let N = i. If X has polar
coordinates r and $\theta$, we have $\|X\| = r$, $X \cdot N = r\cos\theta$, and Equation (13.23) becomes

(13.24)    $r = e\,|r\cos\theta - d|$.

If X lies to the left of the directrix, we have $r\cos\theta < d$, so $|r\cos\theta - d| = d - r\cos\theta$
and (13.24) becomes $r = e(d - r\cos\theta)$, or, solving for r, we obtain

(13.25)    $r = \frac{ed}{e\cos\theta + 1}$.

If X lies to the right of the directrix, we have $r\cos\theta > d$, so (13.24) becomes

$r = e(r\cos\theta - d)$,

giving us

(13.26)    $r = \frac{ed}{e\cos\theta - 1}$.


Since r > 0 , this last equation implies e > 1. In other words, there are points to the right 
of the directrix only for the hyperbola. Thus, we have proved the following theorem which 
is illustrated in Figure 13.13. 




Figure 13.13 Conic sections with polar equation $r = e\,|r\cos\theta - d|$. The focus F
is at the origin and lies to the left of the directrix. (a) $r\cos\theta < d$ on the ellipse, parabola,
and left branch of the hyperbola. (b) $r\cos\theta > d$ on the right branch of the hyperbola.


theorem 13.18. Let C be a conic section with eccentricity e, with a focus F at the origin,
and with a vertical directrix L at a distance d to the right of F. If $0 < e \le 1$, the conic C is
an ellipse or a parabola; every point on C lies to the left of L and satisfies the polar equation

(13.27)    $r = \frac{ed}{e\cos\theta + 1}$.

If e > 1, the curve is a hyperbola with a branch on each side of L. Points on the left branch
satisfy (13.27) and points on the right branch satisfy

(13.28)    $r = \frac{ed}{e\cos\theta - 1}$.
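As a numerical check of Theorem 13.18, the sketch below evaluates (13.27) and (13.28) and confirms that the resulting points satisfy $r = e\,|r\cos\theta - d|$. The function names and the sample values of e and d are assumptions made for illustration, not part of the text.

```python
import math

def conic_polar_radius(e, d, theta, right_branch=False):
    """r from (13.27), or from (13.28) when right_branch is True.

    Returns None when the denominator is not positive, i.e. when the
    conic has no point in the direction theta on that branch.
    """
    denom = e * math.cos(theta) - 1 if right_branch else e * math.cos(theta) + 1
    if denom <= 0:
        return None
    return e * d / denom

def satisfies_13_24(e, d, r, theta):
    """Check r = e |r cos(theta) - d| up to rounding error."""
    return math.isclose(r, e * abs(r * math.cos(theta) - d), rel_tol=1e-9)

# Hypothetical ellipse (e = 1/2, d = 2): every direction gives a point left of L.
for k in range(12):
    theta = 2 * math.pi * k / 12
    r = conic_polar_radius(0.5, 2.0, theta)
    assert satisfies_13_24(0.5, 2.0, r, theta) and r * math.cos(theta) < 2.0

# Hypothetical hyperbola (e = 2, d = 1): one point on each branch.
r_left = conic_polar_radius(2.0, 1.0, 1.0)
r_right = conic_polar_radius(2.0, 1.0, 0.5, right_branch=True)
print(r_left, r_right)
```

For the hyperbola, a direction with $e\cos\theta + 1 \le 0$ meets no point of the left branch, which the function reports as `None`.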


Polar equations corresponding to other positions of the directrix are discussed in the
next set of exercises.


13.21 Exercises

1. Prove that Equation (13.22) in Theorem 13.17 must be replaced by

$\|X - F\| = e\,|(X - F) \cdot N + d|$

if F is in the positive half-plane determined by N.

2. Let C be a conic section with eccentricity e, with a focus at the origin, and with a vertical 
directrix L at a distance d to the left of F. 

(a) Prove that if C is an ellipse or parabola, every point of C lies to the right of L and satisfies
the polar equation

$r = \frac{ed}{1 - e\cos\theta}$.

(b) Prove that if C is a hyperbola, points on the right branch satisfy the equation in part (a)
and points on the left branch satisfy $r = -ed/(1 + e\cos\theta)$. Note that $1 + e\cos\theta$ is always
negative in this case.

3. If a conic section has a horizontal directrix at a distance d above a focus at the origin, prove
that its points satisfy the polar equations obtained from those in Theorem 13.18 by replacing
$\cos\theta$ by $\sin\theta$. What are the corresponding polar equations if the directrix is horizontal and
lies below the focus?

Each of Exercises 4 through 9 gives a polar equation for a conic section with a focus F at the
origin and a vertical directrix lying to the right of F. In each case, determine the eccentricity e
and the distance d from the focus to the directrix. Make a sketch showing the relation of the curve
to its focus and directrix.


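4. $r = \frac{[\text{numerator illegible}]}{1 + \cos\theta}$.

5. $r = \frac{3}{1 + \frac{1}{2}\cos\theta}$.

6. $r = \frac{6}{3 + \cos\theta}$.

7. $r = \frac{[\text{numerator illegible}]}{-\frac{1}{2} + \cos\theta}$.

8. $r = \frac{4}{1 + 2\cos\theta}$.

9. $r = \frac{4}{1 + \cos\theta}$.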


In each of Exercises 10 through 12, a conic section of eccentricity e has a focus at the origin and 
a directrix with the given Cartesian equation. In each case, compute the distance d from the focus 
to the directrix and determine a polar equation for the conic section. For a hyperbola, give a 
polar equation for each branch. Make a sketch showing the relation of the curve to its focus 
and directrix. 

10. e = directrix: 3x +4 y = 25. 

11. e = 1; directrix: 4x + 3y = 25.

12. e = 2; directrix: x + y = 1. 

13. A comet moves in a parabolic orbit with the sun at the focus. When the comet is $10^8$ miles
from the sun, a vector from the focus to the comet makes an angle of $\pi/3$ with a unit vector
N from the focus perpendicular to the directrix, the focus being in the negative half-plane
determined by N.

(a) Find a polar equation for the orbit, taking the origin at the focus, and compute the
smallest distance from the comet to the sun.

(b) Solve part (a) if the focus is in the positive half-plane determined by N.


13.22 Conic sections symmetric about the origin

A set of points is said to be symmetric about the origin if -X is in the set whenever X is 
in the set. We show next that the focus of an ellipse or hyperbola can always be placed so 
the conic section will be symmetric about the origin. To do this we rewrite the basic 
equation (13.22) as follows: 

(13.29)    $\|X - F\| = e\,|(X - F) \cdot N - d| = e\,|X \cdot N - F \cdot N - d| = |eX \cdot N - a|$,

where $a = ed + eF \cdot N$. Squaring both members, we obtain

(13.30)    $\|X\|^2 - 2F \cdot X + \|F\|^2 = e^2 (X \cdot N)^2 - 2ea\,X \cdot N + a^2$.

If we are to have symmetry about the origin, this equation must also be satisfied when X
is replaced by -X, giving us

(13.31)    $\|X\|^2 + 2F \cdot X + \|F\|^2 = e^2 (X \cdot N)^2 + 2ea\,X \cdot N + a^2$.

Subtracting (13.31) from (13.30), we have symmetry if and only if

$F \cdot X = ea\,X \cdot N$    or    $(F - eaN) \cdot X = 0$.

This equation can be satisfied for all X on the curve if and only if F and N are related by
the equation

(13.32)    $F = eaN$,    where $a = ed + eF \cdot N$.

The relation F = eaN implies $F \cdot N = ea$, giving us $a = ed + e^2 a$. If e = 1, this last
equation cannot be satisfied since d, the distance from the focus to the directrix, is nonzero.
This means there is no symmetry about the origin for a parabola. If $e \ne 1$, we can always
satisfy the relations in (13.32) by taking

(13.33)    $a = \frac{ed}{1 - e^2}$,    $F = \frac{e^2 d}{1 - e^2}\,N$.

Note that a > 0 if e < 1 and a < 0 if e > 1. Putting F = eaN in (13.30) we obtain the
following.


theorem 13.19. Let C be a conic section with eccentricity $e \ne 1$ and with a focus F at
a distance d from a directrix L. If N is a unit normal to L and if F = eaN, where
$a = ed/(1 - e^2)$, then C is the set of all points X satisfying the equation

(13.34)    $\|X\|^2 + e^2 a^2 = e^2 (X \cdot N)^2 + a^2$.

This equation displays the symmetry about the origin since it is unchanged when X is
replaced by -X. Because of this symmetry, the ellipse and the hyperbola each have two
foci, symmetrically located about the center, and two directrices, also symmetrically located
about the center.

Equation (13.34) is satisfied when $X = \pm aN$. These two points are called vertices of the
conic. The segment joining them is called the major axis if the conic is an ellipse, the
transverse axis if the conic is a hyperbola.

Let N' be a unit vector orthogonal to N. If X = bN', then $X \cdot N = 0$, so Equation (13.34)
is satisfied by X = bN' if and only if $b^2 + e^2 a^2 = a^2$. This requires e < 1, $b^2 = a^2(1 - e^2)$.
The segment joining the points $X = \pm bN'$, where $b = a\sqrt{1 - e^2}$, is called the minor axis
of the ellipse.

Note: If we put e = 0 in (13.34), it becomes $\|X\| = a$, the equation of a circle of radius
a and center at the origin. In view of (13.33), we can consider such a circle as a limiting
case of an ellipse in which $e \to 0$ and $d \to \infty$ in such a way that $ed \to a$.
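The relations (13.33) and the equivalence of (13.22) with (13.34) can be spot-checked numerically. The following sketch takes N = i for concreteness; the function names and the sample values of e and d are assumptions made here, not part of the text.

```python
import math

def center_form_parameters(e, d):
    """Return (a, ea) from (13.33): a = ed/(1 - e^2), and the focus is F = (ea) N."""
    if e == 1:
        raise ValueError("a parabola has no center of symmetry")
    a = e * d / (1 - e * e)
    return a, e * a

def on_conic_13_22(x, y, e, d):
    """Test (13.22) with N = i and F = (ea, 0)."""
    _, f = center_form_parameters(e, d)
    return math.isclose(math.hypot(x - f, y), e * abs((x - f) - d), abs_tol=1e-9)

def on_conic_13_34(x, y, e, d):
    """Test (13.34) with N = i: x^2 + y^2 + e^2 a^2 = e^2 x^2 + a^2."""
    a, _ = center_form_parameters(e, d)
    return math.isclose(x * x + y * y + e * e * a * a, e * e * x * x + a * a, abs_tol=1e-9)

# Hypothetical ellipse e = 1/2, d = 3: a = 2, F = (1, 0), b = a sqrt(1 - e^2) = sqrt(3).
print(center_form_parameters(0.5, 3.0))
for point in [(2.0, 0.0), (-2.0, 0.0), (0.0, math.sqrt(3)), (0.0, -math.sqrt(3))]:
    assert on_conic_13_22(*point, 0.5, 3.0) and on_conic_13_34(*point, 0.5, 3.0)

# Hypothetical hyperbola e = 2, d = 3: a = -2 (negative, as noted after (13.33)).
print(center_form_parameters(2.0, 3.0))
```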


13.23 Cartesian equations for the conic sections 

To obtain Cartesian equations for the ellipse and hyperbola, we simply write (13.34)
in terms of the rectangular coordinates of X. Choose N = i (which means the directrices
are vertical) and let X = (x, y). Then $\|X\|^2 = x^2 + y^2$, $X \cdot N = x$, and (13.34) becomes
$x^2 + y^2 + e^2 a^2 = e^2 x^2 + a^2$, or $x^2(1 - e^2) + y^2 = a^2(1 - e^2)$, which gives us

(13.35)    $\frac{x^2}{a^2} + \frac{y^2}{a^2(1 - e^2)} = 1$.

This Cartesian equation represents both the ellipse (e < 1) and the hyperbola (e > 1) and
is said to be in standard form. The foci are at the points (ae, 0) and (-ae, 0); the directrices
are the vertical lines x = a/e and x = -a/e.

If e < 1, we let $b = a\sqrt{1 - e^2}$ and write the equation of the ellipse in the standard form

(13.36)    $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.

Its foci are located at (c, 0) and (-c, 0), where $c = ae = \sqrt{a^2 - b^2}$. An example is shown
in Figure 13.14(a).
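To connect the standard form (13.36) back to the focus-directrix definition, the sketch below computes c, e, and the directrix x = a/e from the semiaxes and checks that points of the ellipse satisfy (13.20). The semiaxes a = 5, b = 3, the function name, and the parametrization $(a\cos t, b\sin t)$ are assumptions chosen for illustration.

```python
import math

def ellipse_parameters(a, b):
    """For x^2/a^2 + y^2/b^2 = 1 with a > b > 0, return (c, e, a/e)."""
    c = math.sqrt(a * a - b * b)   # focal distance from the center
    e = c / a                      # eccentricity
    return c, e, a / e             # a/e is the x-coordinate of the right directrix

a, b = 5.0, 3.0                    # hypothetical semiaxes
c, e, x_dir = ellipse_parameters(a, b)
print(c, e, x_dir)                 # c = 4, e = 0.8, directrix x = 6.25

for k in range(8):
    t = 2 * math.pi * k / 8
    x, y = a * math.cos(t), b * math.sin(t)     # a point on the ellipse
    to_focus = math.hypot(x - c, y)             # distance to the focus (c, 0)
    to_directrix = abs(x - x_dir)               # distance to the line x = a/e
    assert math.isclose(to_focus, e * to_directrix)
```

The function assumes a > b; for a = b the eccentricity is 0 and the directrix recedes to infinity, in line with the note at the end of Section 13.22.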

If e > 1, we let $b = |a|\sqrt{e^2 - 1}$ and write the equation of the hyperbola in the standard
form

(13.37)    $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$.

Its foci are at the points (c, 0) and (-c, 0), where $c = |a|\,e = \sqrt{a^2 + b^2}$. An example is
shown in Figure 13.14(b).

Note: Solving for y in terms of x in (13.37), we obtain two solutions

(13.38)    $y = \pm\frac{b}{|a|}\sqrt{x^2 - a^2}$.

For large positive x, the number $\sqrt{x^2 - a^2}$ is nearly equal to x, so the right member of
(13.38) is nearly $\pm bx/|a|$. It is easy to prove that the difference between $y_1 = bx/|a|$ and
$y_2 = b\sqrt{x^2 - a^2}/|a|$ approaches 0 as $x \to +\infty$. This difference is

$y_1 - y_2 = \frac{b}{|a|}\left(x - \sqrt{x^2 - a^2}\right) = \frac{b}{|a|} \cdot \frac{x^2 - (x^2 - a^2)}{x + \sqrt{x^2 - a^2}} = \frac{|a|\,b}{x + \sqrt{x^2 - a^2}} < \frac{|a|\,b}{x}$,

so $y_1 - y_2 \to 0$ as $x \to +\infty$. Therefore, the line $y = bx/|a|$ is an asymptote of the
hyperbola. The line $y = -bx/|a|$ is another asymptote. The hyperbola is said to approach
these lines asymptotically. The asymptotes are shown in Figure 13.14(b).
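The estimate $y_1 - y_2 < |a|\,b/x$ is easy to observe numerically. The sketch below is an illustration with assumed values a = 3, b = 4 and an assumed helper name `asymptote_gap`.

```python
import math

def asymptote_gap(a, b, x):
    """y1 - y2, with y1 = b x/|a| on the asymptote and y2 = b sqrt(x^2 - a^2)/|a| on the hyperbola."""
    return (b / abs(a)) * (x - math.sqrt(x * x - a * a))

a, b = 3.0, 4.0   # hypothetical hyperbola x^2/9 - y^2/16 = 1
for x in (10.0, 100.0, 1000.0):
    gap = asymptote_gap(a, b, x)
    bound = abs(a) * b / x          # the upper bound |a| b / x from the text
    print(x, gap, bound)
```

Each tenfold increase in x shrinks both the gap and its bound by roughly a factor of ten, consistent with the gap tending to 0.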




Figure 13.14 Conic sections of eccentricity $e \ne 1$, symmetric about the origin. The
foci are at $(\pm c, 0)$, where $c = |a|\,e$. The triangles relate a, b, c geometrically.
(a) Ellipse: $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$; $b^2 = a^2 - c^2$. (b) Hyperbola: $\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$; $b^2 = c^2 - a^2$.


The Cartesian equation for the ellipse and hyperbola will take a different form if the
directrices are not vertical. For example, if the directrices are taken to be horizontal, we
may take N = j in Equation (13.34). Since $X \cdot N = X \cdot j = y$, we obtain a Cartesian
equation like (13.35), except that x and y are interchanged. The standard form in this
case is

(13.39)    $\frac{y^2}{a^2} + \frac{x^2}{a^2(1 - e^2)} = 1$.

If the conic is translated by adding a vector $X_0 = (x_0, y_0)$ to each of its points, the center
will be at $(x_0, y_0)$ instead of at the origin. The corresponding Cartesian equations may be
obtained from (13.35) or (13.39) by replacing x by $x - x_0$ and y by $y - y_0$.

To obtain a Cartesian equation for the parabola, we return to the basic equation (13.20)
with e = 1. Take the directrix to be the vertical line x = -c and place the focus at (c, 0).
If X = (x, y), we have X - F = (x - c, y), and Equation (13.20) gives us
$(x - c)^2 + y^2 = |x + c|^2$. This simplifies to the standard form

(13.40)    $y^2 = 4cx$.

The point midway between the focus and directrix (the origin in Figure 13.15) is called the
vertex of the parabola, and the line passing through the vertex and focus is the axis of the
parabola. The parabola is symmetric about its axis. If c > 0, the parabola lies to the right
of the y-axis, as in Figure 13.15. When c < 0, the curve lies to the left of the y-axis.
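The derivation of (13.40) can be verified numerically: every point of $y^2 = 4cx$ should be equidistant from the focus (c, 0) and the directrix x = -c. The parametrization $(ct^2, 2ct)$ and the sample values of c below are assumptions introduced for this sketch.

```python
import math

def parabola_point(c, t):
    """The point (c t^2, 2 c t), which satisfies y^2 = 4 c x for every real t."""
    return c * t * t, 2 * c * t

for c in (1.5, -2.0):                          # one parabola opening right, one left
    for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
        x, y = parabola_point(c, t)
        assert math.isclose(y * y, 4 * c * x)  # the point lies on y^2 = 4cx
        to_focus = math.hypot(x - c, y)        # distance to the focus (c, 0)
        to_directrix = abs(x + c)              # distance to the line x = -c
        assert math.isclose(to_focus, to_directrix)
```

Both distances equal $|c|(t^2 + 1)$, which is smallest at t = 0, the vertex.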



If the axes are chosen so the focus is on the y-axis at the point (0, c) and if the horizontal
line y = -c is taken as directrix, the standard form of the Cartesian equation becomes

$x^2 = 4cy$.

When c > 0 the parabola opens upward as shown in Figure 13.16. When c < 0, it opens
downward.

If the parabola in Figure 13.15 is translated so that its vertex is at the point $(x_0, y_0)$, the
corresponding equation becomes

$(y - y_0)^2 = 4c(x - x_0)$.

The focus is now at the point $(x_0 + c, y_0)$ and the directrix is the line $x = x_0 - c$. The
axis of the parabola is the line $y = y_0$.

Similarly, a translation of the parabola in Figure 13.16 leads to the equation

$(x - x_0)^2 = 4c(y - y_0)$,

with focus at $(x_0, y_0 + c)$. The line $y = y_0 - c$ is its directrix, the line $x = x_0$ its axis.

The reader may find it amusing to prove that a parabola does not have any asymptotes.


13.24 Exercises


Each of the equations in Exercises 1 through 6 represents an ellipse. Find the coordinates of
the center, the foci, and the vertices, and sketch each curve. Also determine the eccentricity.

1. $\frac{x^2}{100} + \frac{y^2}{36} = 1$.

2. $\frac{y^2}{100} + \frac{x^2}{36} = 1$.

3. $\frac{(x - 2)^2}{16} + \frac{(y + 3)^2}{9} = 1$.

4. $9x^2 + 25y^2 = 25$.

5. $4y^2 + 3x^2 = 1$.

6. $\frac{(x + 1)^2}{[\text{illegible}]} + \frac{(y + 2)^2}{25} = 1$.

In each of Exercises 7 through 12, find a Cartesian equation (in the appropriate standard form) 
for the ellipse that satisfies the conditions given. Sketch each curve. 

7. Center at (0, 0), one focus at (f, 0), one vertex at (1,0). 

8. Center at (-3, 4), semiaxes of lengths 4 and 3, major axis parallel to the x-axis. 

9. Same as Exercise 8, except with major axis parallel to the y-axis. 

10. Vertices at (-1, 2), (-7, 2), minor axis of length 2.

11. Vertices at (3, -2), (13, -2), foci at (4, -2), (12, -2). 

12. Center at (2, 1), major axis parallel to the x-axis, the curve passing through the points (6, 1) 
and (2, 3). 

Each of the equations in Exercises 13 through 18 represents a hyperbola. Find the coordinates 
of the center, the foci, and the vertices. Sketch each curve and show the positions of the asymptotes. 
Also, compute the eccentricity. 


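13. $\frac{x^2}{100} - \frac{y^2}{[\text{illegible}]} = 1$.

14. $\frac{y^2}{100} - \frac{x^2}{[\text{illegible}]} = 1$.

15. $(x + 3)^2 - (y - 3)^2 = 1$.

16. $9x^2 - 16y^2 = 144$.

17. $4x^2 - 5y^2 + 20 = 0$.

18. $\frac{(x - 1)^2}{4} - \frac{(y + 2)^2}{9} = 1$.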


In each of Exercises 19 through 23, find a Cartesian equation (in the appropriate standard form) 
for the hyperbola which satisfies the conditions given. Sketch each curve and show the positions 
of the asymptotes. 

19. Center at (0, 0), one focus at (4, 0), one vertex at (2,0). 

20. Foci at $(0, \pm\sqrt{2})$, vertices at $(0, \pm 1)$.

21. Vertices at (±2, 0), asymptotes y = ±2x.

22. Center at (-1, 4), one focus at (-1, 2), one vertex at (-1, 3).

23. Center at (2, -3), transverse axis parallel to one of the coordinate axes, the curve passing
through (3, -1) and (-1, 0).

24. For what value (or values) of C will the line 3x - 2y = C be tangent to the hyperbola
$x^2 - 3y^2 = 1$?

25. The asymptotes of a hyperbola are the lines 2x — y = 0 and 2x + y = 0. Find a Cartesian 
equation for the curve if it passes through the point (3, - 5). 

Each of the equations in Exercises 26 through 31 represents a parabola. Find the coordinates 
of the vertex, an equation for the directrix, and an equation for the axis. Sketch each of the curves. 

26. $y^2 = -8x$.

27. $y^2 = 3x$.

28. $(y - 1)^2 = 12x - 6$.

29. $x^2 = 6y$.

30. $x^2 + 8y = 0$.

31. $(x + 2)^2 = 4y + 9$.

In each of Exercises 32 through 37, find a Cartesian equation (in appropriate standard form)
for the parabola that satisfies the conditions given and sketch the curve.

32. Focus at (0, — J)| equation of directrix, y = -J. 

33. Vertex at (0,0); equation of directrix, x = -2. 

34. Vertex at (—4,3); focus at (-4, 1). 

35. Focus at (3, -1); equation of directrix, x =

36. Axis is parallel to the y-axis; passes through (0, 1), (1, 0), and (2, 0).

37. Axis is parallel to the x-axis; vertex at (1, 3); passes through (-1, -1).

38. Proceeding directly from the focal definition, find a Cartesian equation for the parabola whose 
focus is the origin and whose directrix is the line 2x + y = 10. 


13.25 Miscellaneous exercises on conic sections 

1. Show that the area of the region bounded by the ellipse $x^2/a^2 + y^2/b^2 = 1$ is ab times the
area of a circle of radius 1.

Note: This statement can be proved from general properties of the integral, without
performing any integrations.

2. (a) Show that the volume of the solid of revolution generated by rotating the ellipse
$x^2/a^2 + y^2/b^2 = 1$ about its major axis is $ab^2$ times the volume of a unit sphere.

Note: This statement can be proved from general properties of the integral, without
performing any integrations.

(b) What is the result if the ellipse is rotated about its minor axis? 

3. Find all positive numbers A and B, A > B, such that the area of the region enclosed by the 
ellipse Ax 2 + By 2 = 3 is equal to the area of the region enclosed by the ellipse 

$(A + B)x^2 + (A - B)y^2 = 3$.

4. A parabolic arch has a base of length b and altitude h. Determine the area of the region 
bounded by the arch and the base. 

5. The region bounded by the parabola y 2 = 8x and the line x = 2 is rotated about the x-axis. 
Find the volume of the solid of revolution so generated. 

6. Two parabolas having the equations $y^2 = 2(x - 1)$ and $y^2 = 4(x - 2)$ enclose a plane region R.

(a) Compute the area of R by integration. 

(b) Find the volume of the solid of revolution generated by revolving R about the x-axis. 

(c) Same as (b), but revolve R about the y-axis. 

7. Find a Cartesian equation for the conic section consisting of all points (x, y) whose distance 
from the point (0, 2) is half the distance from the line y = 8. 

8. Find a Cartesian equation for the parabola whose focus is at the origin and whose directrix 
is the line x + y + 1 = 0. 

9. Find a Cartesian equation for a hyperbola passing through the origin, given that its asymptotes 
are the lines y = 2x + 1 and y = -2x + 3.

10. (a) For each p > 0, the equation $px^2 + (p + 2)y^2 = p^2 + 2p$ represents an ellipse. Find
(in terms of p) the eccentricity and the coordinates of the foci.

(b) Find a Cartesian equation for the hyperbola which has the same foci as the ellipse of
part (a) and which has eccentricity $\sqrt{3}$.

11. In Section 13.22 we proved that a conic symmetric about the origin satisfies the equation
$\|X - F\| = |eX \cdot N - a|$, where $a = ed + eF \cdot N$. Use this relation to prove that
$\|X - F\| + \|X + F\| = 2a$ if the conic is an ellipse. In other words, the sum of the distances from any
point on an ellipse to its foci is constant.






12. Refer to Exercise 11. Prove that on each branch of a hyperbola the difference
$\|X - F\| - \|X + F\|$ is constant.

13. (a) Prove that a similarity transformation (replacing x by tx and y by ty) carries an ellipse 
with center at the origin into another ellipse with the same eccentricity. In other words, 
similar ellipses have the same eccentricity. 

(b) Prove also the converse. That is, if two concentric ellipses have the same eccentricity 
and major axes on the same line, then they are related by a similarity transformation. 

(c) Prove results corresponding to (a) and (b) for hyperbolas. 

14. Use the Cartesian equation which represents all conics of eccentricity e and center at the
origin to prove that these conics are integral curves of the differential equation $y' = (e^2 - 1)x/y$.

Note: Since this is a homogeneous differential equation (Section 8.25), the set of all
such conics of eccentricity e is invariant under a similarity transformation. (Compare
with Exercise 13.)

15. (a) Prove that the collection of all parabolas is invariant under a similarity transformation. 
That is, a similarity transformation carries a parabola into a parabola. 

(b) Find all the parabolas similar to $y = x^2$.

16. The line x - y + 4 = 0 is tangent to the parabola $y^2 = 16x$. Find the point of contact.

17. (a) Given $a \neq 0$. If the two parabolas $y^2 = 4p(x - a)$ and $x^2 = 4qy$ are tangent to each other,
show that the x-coordinate of the point of contact is determined by a alone. 

(b) Find a condition on a, p, and q which expresses the fact that the two parabolas are tangent 
to each other. 

18. Consider the locus of the points P in the plane for which the distance of P from the point 
(2, 3) is equal to the sum of the distances of P from the two coordinate axes. 

(a) Show that the part of this locus which lies in the first quadrant is part of a hyperbola. 
Locate the asymptotes and make a sketch. 

(b) Sketch the graph of the locus in the other quadrants. 

19. Two parabolas have the same point as focus and the same line as axis, but their vertices lie 
on opposite sides of the focus. Prove that the parabolas intersect orthogonally (i.e., their 
tangent lines are perpendicular at the points of intersection). 

20. (a) Prove that the Cartesian equation

\[ \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1 \]

represents all conics symmetric about the origin with foci at (c, 0) and (-c, 0).

(b) Keep c fixed and let S denote the set of all such conics obtained as $a^2$ varies over all positive
numbers $\neq c^2$. Prove that every curve in S satisfies the differential equation

\[ xy\left(\frac{dy}{dx}\right)^2 + (x^2 - y^2 - c^2)\frac{dy}{dx} - xy = 0. \]

(c) Prove that S is self-orthogonal; that is, the set of all orthogonal trajectories of curves in
S is S itself. [Hint: Replace $y'$ by $-1/y'$ in the differential equation in (b).]

21. Show that the locus of the centers of a family of circles, all of which pass through a given 
point and are tangent to a given line, is a parabola. 

22. Show that the locus of the centers of a family of circles, all of which are tangent (externally) 
to a given circle and also to a given straight line, is a parabola. (Exercise 21 can be considered 
to be a special case.) 
23. (a) A chord of length 8|c| is drawn perpendicular to the axis of the parabola $y^2 = 4cx$. Let
P and Q be the points where the chord meets the parabola. Show that the vector from O to P
is perpendicular to that from O to Q.

(b) The chord of a parabola drawn through the focus and parallel to the directrix is called
the latus rectum. Show first that the length of the latus rectum is twice the distance from the
focus to the directrix, and then show that the tangents to the parabola at both ends of the
latus rectum intersect the axis of the parabola on the directrix.

24. Two points P and Q are said to be symmetric with respect to a circle if P and Q are collinear 
with the center, if the center is not between them, and if the product of their distances from the 
center is equal to the square of the radius. Given that Q describes the straight line 
x + 2y — 5 = 0, find the locus of the point P symmetric to Q with respect to the circle 
$x^2 + y^2 = 4$.

CALCULUS OF VECTOR-VALUED FUNCTIONS


14.1 Vector-valued functions of a real variable 

This chapter combines vector algebra with the methods of calculus and describes some 
applications to the study of curves and to some problems in mechanics. The concept of a 
vector-valued function is fundamental in this study. 


DEFINITION. A function whose domain is a set of real numbers and whose range is a
subset of n-space $V_n$ is called a vector-valued function of a real variable.


We have encountered such functions in Chapter 13. For example, the line through a 
point P parallel to a nonzero vector A is the range of the vector-valued function X given by 

X(t) = P + tA 

for all real t. 

Vector-valued functions will be denoted by capital letters such as F, G, X, Y, etc., or by
small bold-face italic letters f, g, etc. The value of a function F at t is denoted, as usual, by
F(t). In the examples we shall study, the domain of F will be an interval which may contain
one or both endpoints or which may be infinite.

14.2 Algebraic operations. Components 

The usual operations of vector algebra can be applied to combine two vector-valued 
functions or to combine a vector-valued function with a real-valued function. If F and G 
are vector-valued functions, and if u is a real-valued function, all having a common domain, 
we define new functions $F + G$, $uF$, and $F \cdot G$ by the equations

\[ (F + G)(t) = F(t) + G(t), \qquad (uF)(t) = u(t)F(t), \qquad (F \cdot G)(t) = F(t) \cdot G(t). \]

The sum $F + G$ and the product $uF$ are vector valued, whereas the dot product $F \cdot G$ is
real valued. If F(t) and G(t) are in 3-space, we can also define the cross product $F \times G$ by
the formula

\[ (F \times G)(t) = F(t) \times G(t). \]
The operation of composition may be applied to combine vector-valued functions with
real-valued functions. For example, if F is a vector-valued function whose domain includes
the range of a real-valued function u, the composition $G = F \circ u$ is a new vector-valued
function defined by the equation

\[ G(t) = F[u(t)] \]

for each t in the domain of u.

If a function F has its values in $V_n$, then each vector F(t) has n components, and we can
write

\[ F(t) = (f_1(t), f_2(t), \ldots, f_n(t)). \]

Thus, each vector-valued F gives rise to n real-valued functions $f_1, \ldots, f_n$ whose values at
t are the components of F(t). We indicate this relation by writing $F = (f_1, \ldots, f_n)$, and we
call $f_k$ the kth component of F.


14.3 Limits, derivatives, and integrals 

The basic concepts of calculus, such as limit, derivative, and integral, can also be extended 
to vector-valued functions. We simply express the vector-valued function in terms of its 
components and perform the operations of calculus on the components. 

DEFINITION. If $F = (f_1, \ldots, f_n)$ is a vector-valued function, we define limit, derivative,
and integral by the equations

\[ \lim_{t \to p} F(t) = \Bigl(\lim_{t \to p} f_1(t), \ldots, \lim_{t \to p} f_n(t)\Bigr), \]

\[ F'(t) = \bigl(f_1'(t), \ldots, f_n'(t)\bigr), \]

\[ \int_a^b F(t)\,dt = \Bigl(\int_a^b f_1(t)\,dt, \ldots, \int_a^b f_n(t)\,dt\Bigr), \]

whenever the components on the right are meaningful.

We also say that F is continuous, differentiable, or integrable on an interval if each component
of F has the corresponding property on the interval.
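
The componentwise definitions above translate directly into a numerical procedure. The following is a minimal sketch, not part of the original text: the helper names `derivative` and `integral`, the step sizes, and the sample function `F` are all illustrative assumptions. It estimates F'(t) by a central difference and the integral by Simpson's rule, one component at a time.

```python
import math

def derivative(F, t, h=1e-6):
    """Central-difference estimate of F'(t), taken component by component."""
    forward, backward = F(t + h), F(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

def integral(F, a, b, n=1000):
    """Composite Simpson estimate of the integral of F over [a, b], per component."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    dim = len(F(a))
    total = [0.0] * dim
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        value = F(a + i * h)
        for k in range(dim):
            total[k] += weight * value[k]
    return tuple(s * h / 3 for s in total)

def F(t):
    # Hypothetical sample function with values in 3-space.
    return (math.cos(t), math.sin(t), t)

print(derivative(F, 0.0))         # approximately (0, 1, 1)
print(integral(F, 0.0, math.pi))  # approximately (0, 2, pi**2 / 2)
```

Each output component is computed without reference to the others, which is exactly what the definition prescribes.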

In view of these definitions, it is not surprising to find that many of the theorems on 
limits, continuity, differentiation, and integration of real-valued functions are also valid for 
vector-valued functions. We state some of the theorems that we use in this chapter. 

THEOREM 14.1. If F, G, and u are differentiable on an interval, then so are $F + G$, $uF$,
and $F \cdot G$, and we have

\[ (F + G)' = F' + G', \qquad (uF)' = u'F + uF', \qquad (F \cdot G)' = F' \cdot G + F \cdot G'. \]

If F and G have values in $V_3$, we also have

\[ (F \times G)' = F' \times G + F \times G'. \]

Proof. To indicate the routine nature of the proofs we discuss the formula for $(uF)'$.
The proofs of the others are similar and are left as exercises for the reader.

Writing $F = (f_1, \ldots, f_n)$, we have

\[ uF = (uf_1, \ldots, uf_n), \qquad (uF)' = \bigl((uf_1)', \ldots, (uf_n)'\bigr). \]

But the derivative of the kth component of uF is $(uf_k)' = u'f_k + uf_k'$, so we have

\[ (uF)' = u'(f_1, \ldots, f_n) + u(f_1', \ldots, f_n') = u'F + uF'. \]

The reader should note that the differentiation formulas in Theorem 14.1 are analogous
to the usual formulas for differentiating a sum or product of real-valued functions. Since
the cross product is not commutative, one must pay attention to the order of the factors in
the formula for $(F \times G)'$.

The formula for differentiating F • G gives us the following theorem which we shall use 
frequently. 


THEOREM 14.2. If a vector-valued function F is differentiable and has constant length on
an open interval I, then $F \cdot F' = 0$ on I. In other words, F'(t) is perpendicular to F(t) for
each t in I.

Proof. Let $g(t) = \|F(t)\|^2 = F(t) \cdot F(t)$. By hypothesis, g is constant on I, and hence
$g' = 0$ on I. But since g is a dot product, we have $g' = F' \cdot F + F \cdot F' = 2F \cdot F'$. Therefore
we have $F \cdot F' = 0$.
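
As an illustration of Theorem 14.2, the sketch below evaluates $F \cdot F'$ for one function of constant length. The choice $F(t) = (\cos t^2, \sin t^2)$ and its hand-computed derivative are assumptions made for this example only; they do not come from the text.

```python
import math

def F(t):
    # A hypothetical function of constant length 1.
    return (math.cos(t * t), math.sin(t * t))

def F_prime(t):
    # Derivative of F, computed by hand with the chain rule.
    return (-2 * t * math.sin(t * t), 2 * t * math.cos(t * t))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

for t in (0.0, 0.5, 1.0, 2.0):
    # First value: squared length of F(t); second value: F(t) . F'(t).
    print(t, dot(F(t), F(t)), dot(F(t), F_prime(t)))
```

The squared length stays at 1 up to rounding, and the dot product with the derivative vanishes up to rounding, even though the derivative itself grows with t.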

The next theorem deals with composite functions. Its proof follows easily from Theorems 
3.5 and 4.2 which contain the corresponding results for real-valued functions. 

THEOREM 14.3. Let $G = F \circ u$, where F is vector valued and u is real valued. If u is
continuous at t and if F is continuous at u(t), then G is continuous at t. If the derivatives
$u'(t)$ and $F'[u(t)]$ exist, then $G'(t)$ also exists and is given by the chain rule,

\[ G'(t) = F'[u(t)]\,u'(t). \]

If a vector-valued function F is continuous on a closed interval [a, b], then each component
is continuous and hence integrable on [a, b], so F is integrable on [a, b]. The next
three theorems give basic properties of the integral of vector-valued functions. In each
case, the proofs follow at once from the corresponding results for integrals of real-valued
functions.

THEOREM 14.4. LINEARITY AND ADDITIVITY. If the vector-valued functions F and G
are integrable on [a, b], so is $c_1F + c_2G$ for all $c_1$ and $c_2$, and we have

\[ \int_a^b \bigl(c_1F(t) + c_2G(t)\bigr)\,dt = c_1\int_a^b F(t)\,dt + c_2\int_a^b G(t)\,dt. \]

Also, for each c in [a, b], we have

\[ \int_a^b F(t)\,dt = \int_a^c F(t)\,dt + \int_c^b F(t)\,dt. \]

THEOREM 14.5. FIRST FUNDAMENTAL THEOREM OF CALCULUS. Assume F is a vector-valued
function continuous on [a, b]. If $c \in [a, b]$, define the indefinite integral A to be the
vector-valued function given by

\[ A(x) = \int_c^x F(t)\,dt \quad \text{if } a \le x \le b. \]

Then $A'(x)$ exists, and we have $A'(x) = F(x)$ for each x in (a, b).

THEOREM 14.6. SECOND FUNDAMENTAL THEOREM OF CALCULUS. Assume that the vector-valued
function F has a continuous derivative F' on an open interval I. Then, for each choice of
c and x in I, we have

\[ F(x) = F(c) + \int_c^x F'(t)\,dt. \]

The next theorem is an extension of the property $c\int_a^b F(t)\,dt = \int_a^b cF(t)\,dt$, with
multiplication by the scalar c replaced by dot multiplication by a vector C.

THEOREM 14.7. If $F = (f_1, \ldots, f_n)$ is integrable on [a, b], then for every vector
$C = (c_1, \ldots, c_n)$ the dot product $C \cdot F$ is integrable on [a, b], and we have

\[ C \cdot \int_a^b F(t)\,dt = \int_a^b C \cdot F(t)\,dt. \]

Proof. Since each component of F is integrable, we have

\[ C \cdot \int_a^b F(t)\,dt = \sum_{i=1}^n c_i \int_a^b f_i(t)\,dt = \int_a^b \sum_{i=1}^n c_i f_i(t)\,dt = \int_a^b C \cdot F(t)\,dt. \]

Now we use Theorem 14.7 in conjunction with the Cauchy-Schwarz inequality to obtain 
the following important property of integrals of vector-valued functions. 

THEOREM 14.8. If F and $\|F\|$ are integrable on [a, b] we have

\[ \left\| \int_a^b F(t)\,dt \right\| \le \int_a^b \|F(t)\|\,dt. \tag{14.1} \]

Proof. Let $C = \int_a^b F(t)\,dt$. If $C = 0$, then (14.1) holds trivially. Assume, then, that
$C \neq 0$ and apply Theorem 14.7 to get

\[ \|C\|^2 = C \cdot C = C \cdot \int_a^b F(t)\,dt = \int_a^b C \cdot F(t)\,dt. \tag{14.2} \]

Since the dot product $C \cdot F(t)$ is real valued, we have the inequality

\[ \int_a^b C \cdot F(t)\,dt \le \int_a^b |C \cdot F(t)|\,dt \le \int_a^b \|C\|\,\|F(t)\|\,dt, \tag{14.3} \]

where in the last step we used the Cauchy-Schwarz inequality, $|C \cdot F(t)| \le \|C\|\,\|F(t)\|$.
Combining (14.2) and (14.3), we get

\[ \|C\|^2 \le \|C\| \int_a^b \|F(t)\|\,dt. \]

Since $\|C\| > 0$, we can divide by $\|C\|$ to get (14.1).
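
Inequality (14.1) can be observed numerically. The sketch below is only an illustration under assumed choices: the function $F(t) = (\cos t, \sin t)$ on $[0, \pi]$ and the helper `midpoint_integral` are not from the text. Here the integral of F is approximately (0, 2), of length about 2, while the integral of $\|F\|$ is about $\pi$.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Midpoint-rule estimate of the integral of a real-valued g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def F(t):
    # Hypothetical sample function with values in 2-space.
    return (math.cos(t), math.sin(t))

a, b = 0.0, math.pi
# Integrate each component separately, as in the definition of Section 14.3.
components = [midpoint_integral(lambda t, k=k: F(t)[k], a, b) for k in range(2)]
lhs = math.hypot(*components)                               # length of the integral of F
rhs = midpoint_integral(lambda t: math.hypot(*F(t)), a, b)  # integral of the length of F
print(lhs, rhs)  # roughly 2 and pi
```

The gap between the two sides reflects how much F(t) changes direction over the interval; for a function of fixed direction the two sides would agree.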


14.4 Exercises 

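Compute the derivatives $F'(t)$ and $F''(t)$ for each of the vector-valued functions in Exercises 1
through 6.

1. $F(t) = (t, t^2, t^3, t^4)$.

2. $F(t) = (\cos t, \sin^2 t, \sin 2t, \tan t)$.

3. $F(t) = (\arcsin t, \arccos t)$.

4. $F(t) = 2e^t\,\mathbf{i} + 3e^t\,\mathbf{j}$.

5. $F(t) = \cosh t\,\mathbf{i} + \sinh 2t\,\mathbf{j} + e^{-3t}\,\mathbf{k}$.

6. $F(t) = \log(1 + t^2)\,\mathbf{i} + \arctan t\,\mathbf{j} + \dfrac{1}{1 + t^2}\,\mathbf{k}$.

7. Let F be the vector-valued function given by

\[ F(t) = \frac{2t}{1 + t^2}\,\mathbf{i} + \frac{1 - t^2}{1 + t^2}\,\mathbf{j} + \mathbf{k}. \]

Prove that the angle between $F(t)$ and $F'(t)$ is constant, that is, independent of t.

Compute the vector-valued integrals in Exercises 8 through 11.

8. $\int_0^1 (t, \sqrt{t}, e^t)\,dt$.

9. $\int_0^{\pi/4} (\sin t, \cos t, \tan t)\,dt$.

10. $\int_0^1 \left(\dfrac{e^t}{1 + e^t}\,\mathbf{i} + \dfrac{1}{1 + e^t}\,\mathbf{j}\right) dt$.

11. $\int_0^1 (te^t\,\mathbf{i} + t^2e^t\,\mathbf{j} + te^{-t}\,\mathbf{k})\,dt$.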

12. Compute $A \cdot B$, where $A = 2\mathbf{i} - 4\mathbf{j} + \mathbf{k}$ and $B = \int_0^1 (te^{2t}\,\mathbf{i} + t\cosh 2t\,\mathbf{j} + 2te^{-2t}\,\mathbf{k})\,dt$.

13. Given a nonzero vector B and a vector-valued function F such that $F(t) \cdot B = t$ for all t,
and such that the angle between $F'(t)$ and B is constant (independent of t). Prove that $F''(t)$
is orthogonal to $F'(t)$.

14. Given fixed nonzero vectors A and B, let $F(t) = e^{2t}A + e^{-2t}B$. Prove that $F''(t)$ has the same
direction as $F(t)$.

15. If $G = F \times F'$, compute $G'$ in terms of F and derivatives of F.

16. If $G = F \cdot F' \times F''$, prove that $G' = F \cdot F' \times F'''$.

17. Prove that $\lim_{t \to p} F(t) = A$ if and only if $\lim_{t \to p} \|F(t) - A\| = 0$.

18. Prove that a vector-valued function F is differentiable on an open interval I if and only if
for each t in I we have

\[ F'(t) = \lim_{h \to 0} \frac{1}{h}\,[F(t + h) - F(t)]. \]

19. Prove the zero-derivative theorem for vector-valued functions. If $F'(t) = 0$ for each t in an
open interval I, then there is a vector C such that $F(t) = C$ for all t in I.
20. Given fixed vectors A and B and a vector-valued function F such that $F''(t) = tA + B$,
determine $F(t)$ if $F(0) = D$ and $F'(0) = C$.

21. A differential equation of the form $Y'(x) + p(x)Y(x) = Q(x)$, where p is a given real-valued
function, Q a given vector-valued function, and Y an unknown vector-valued function, is called
a first-order linear vector differential equation. Prove that if p and Q are continuous on an
interval I, then for each a in I and each vector B there is one and only one solution Y which
satisfies the initial condition $Y(a) = B$, and that this solution is given by the formula

\[ Y(t) = Be^{-q(t)} + e^{-q(t)} \int_a^t Q(x)e^{q(x)}\,dx, \]

where $q(x) = \int_a^x p(t)\,dt$.

22. A vector-valued function F satisfies the equation $tF'(t) = F(t) + tA$ for each $t \ge 0$, where A
is a fixed vector. Compute $F''(1)$ and $F(3)$ in terms of A, if $F(1) = 2A$.

23. Find a vector-valued function F, continuous on the interval $(0, +\infty)$, such that

\[ F(x) = xe^xA + \frac{1}{x}\int_1^x F(t)\,dt, \]

for all x > 0, where A is a fixed nonzero vector.

24. A vector-valued function F, which is never zero and has a continuous derivative F’(t) for 
all t, is always parallel to its derivative. Prove that there is a constant vector A and a positive 
real-valued function u such that F(t) = u(t)A for all t. 


14.5 Applications to curves. Tangency 

Let X be a vector-valued function whose domain is an interval I. As t runs through I,
the corresponding function values X(t) run through a set of points which we call the
graph of the function X. If the function values are in 2-space or in 3-space, we can visualize
the graph geometrically. For example, if $X(t) = P + tA$, where P and A are fixed vectors
in $V_3$, with $A \neq 0$, the graph of X is a straight line through P parallel to A. A more general
function will trace out a more general graph, as suggested by the example in Figure 14.1.

If X is continuous on I, such a graph is called a curve; more specifically, the curve described
by X. Sometimes we say that the curve is described parametrically by X. The interval I
is called a parametric interval; each t in I is called a parameter.

Properties of the function X can be used to investigate geometric properties of its graph. 

In particular, the derivative X’ is related to the concept of tangency, as in the case of a 
real-valued function. We form the difference quotient 

\[ \frac{X(t + h) - X(t)}{h} \tag{14.4} \]

and investigate its behavior as $h \to 0$. This quotient is the product of the vector $X(t + h) - X(t)$
by the scalar $1/h$. The numerator, $X(t + h) - X(t)$, illustrated geometrically in
Figure 14.2, is parallel to the vector in (14.4). If we express this difference quotient in
terms of its components and let $h \to 0$, we find that

\[ \lim_{h \to 0} \frac{X(t + h) - X(t)}{h} = X'(t), \]

assuming that the derivative $X'(t)$ exists. The geometric interpretation of this relation
suggests the following definition.

Figure 14.1 A curve traced out by a vector X(t).

Figure 14.2 The vector $X(t + h) - X(t)$ is parallel to $[X(t + h) - X(t)]/h$.


DEFINITION. Let C be a curve described by a continuous vector-valued function X. If
the derivative $X'(t)$ exists and is nonzero, the straight line through X(t) parallel to $X'(t)$ is
called the tangent line to C at X(t). The vector $X'(t)$ is called a tangent vector to C at X(t).

EXAMPLE 1. Straight line. For a line given by $X(t) = P + tA$, where $A \neq 0$, we have
$X'(t) = A$, so the tangent line at each point coincides with the graph of X, a property which
we surely want.

EXAMPLE 2. Circle. If X describes a circle of radius a and center at a point P, then
$\|X(t) - P\| = a$ for each t. The vector $X(t) - P$ is called a radius vector; it may be represented
geometrically by an arrow from the center to the point X(t). Since the radius vector
has constant length, Theorem 14.2 tells us that it is perpendicular to its derivative and hence
perpendicular to the tangent line. Thus, for a circle, our definition of tangency agrees
with that given in elementary plane geometry.

EXAMPLE 3. Invariance under a change of parameter. Different functions can have the
same graph. For example, suppose that X is a continuous vector-valued function defined
on an interval I and suppose that u is a real-valued function that is differentiable with $u'$
never zero on an interval J, and such that the range of u is I. Then the function Y defined
on J by the equation

\[ Y(t) = X[u(t)] \]

is a continuous vector-valued function having the same graph as X. Two functions X 
and Y so related are called equivalent. They are said to provide different parametric 
representations of the same curve. The function u is said to define a change of parameter. 

The most important geometric concepts associated with a curve are those that remain 
invariant under a change of parameter. For example, it is easy to prove that the tangent 
line is invariant. If the derivative $X'[u(t)]$ exists, the chain rule shows that $Y'(t)$ also exists
and is given by the formula

\[ Y'(t) = X'[u(t)]\,u'(t). \]

The derivative $u'(t)$ is never zero. If $X'[u(t)]$ is nonzero, then $Y'(t)$ is also nonzero, so $Y'(t)$
is parallel to $X'[u(t)]$. Therefore both representations X and Y lead to the same tangent
line at each point of the curve.


EXAMPLE 4. Reflection properties of the conic sections. Conic sections have reflection
properties often used in the design of optical and acoustical equipment. Light rays emanating
from one focus of an elliptical reflector will converge at the other focus, as shown in
Figure 14.3(a). Light rays directed toward one focus of a hyperbolic reflector will converge
at the other focus, as suggested by Figure 14.3(b). In a parabolic reflector, light rays
parallel to the axis converge at the focus, as shown in Figure 14.3(c). To establish these
reflection properties, we need to prove that in each figure the angles labeled $\theta$ are equal. We
shall do this for the ellipse and hyperbola and ask the reader to give a proof for the parabola.

Figure 14.3 Reflection properties of the conic sections.

Place one focus $F_1$ at the origin and let $u_1$ and $u_2$ be unit vectors having the same directions
as $X$ and $X - F_2$, respectively, where X is an arbitrary point on the conic. (See Figure
14.4.) If $d_1 = \|X\|$ and $d_2 = \|X - F_2\|$ are the focal distances between X and the foci
$F_1$ and $F_2$, respectively, we have

\[ X = d_1u_1 \quad \text{and} \quad X = d_2u_2 + F_2. \]

Now we think of $X$, $u_1$, $u_2$, $d_1$, and $d_2$ as functions defined on some interval of real numbers.
Their derivatives are related by the equations

\[ X' = d_1u_1' + d_1'u_1, \qquad X' = d_2u_2' + d_2'u_2. \tag{14.5} \]

Since $u_1$ and $u_2$ have constant length, each is perpendicular to its derivative, so Equations
(14.5) give us $X' \cdot u_1 = d_1'$ and $X' \cdot u_2 = d_2'$. Adding and subtracting these relations, we
find that

\[ X' \cdot (u_1 + u_2) = d_1' + d_2', \qquad X' \cdot (u_1 - u_2) = d_1' - d_2'. \tag{14.6} \]

On the ellipse, $d_1 + d_2$ is constant, so $d_1' + d_2' = 0$. On each branch of the hyperbola,
$d_1 - d_2$ is constant, so $d_1' - d_2' = 0$. Therefore, Equations (14.6) give us

\[ X' \cdot (u_1 + u_2) = 0 \ \text{on the ellipse}, \qquad X' \cdot (u_1 - u_2) = 0 \ \text{on the hyperbola}. \]

Figure 14.4 Proofs of the reflection properties for the ellipse and hyperbola. (a) $\theta_2 = \pi - \theta_1$ on the ellipse. (b) $\theta_2 = \theta_1$ on the hyperbola.

Let $T = X'/\|X'\|$ be a unit vector having the same direction as $X'$. Then T is tangent to the
conic, and we have

\[ T \cdot u_2 = -T \cdot u_1 \ \text{on the ellipse}, \qquad T \cdot u_2 = T \cdot u_1 \ \text{on the hyperbola}. \]

If $\theta_1$ and $\theta_2$ denote, respectively, the angles that T makes with $u_1$ and $u_2$, where $0 \le \theta_1 \le \pi$
and $0 \le \theta_2 \le \pi$, these last two equations show that

\[ \cos\theta_2 = -\cos\theta_1 \ \text{on the ellipse}, \qquad \cos\theta_2 = \cos\theta_1 \ \text{on the hyperbola}. \]

Hence we have $\theta_2 = \pi - \theta_1$ on the ellipse, and $\theta_2 = \theta_1$ on the hyperbola. These relations
between the angles $\theta_1$ and $\theta_2$ give the reflection properties of the ellipse and hyperbola.
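
The relation $T \cdot u_2 = -T \cdot u_1$ on the ellipse can be checked numerically. The sketch below is an illustration under assumed values: the semi-axes a = 5 and b = 3, the parametrization $X(t) = (c + a\cos t, b\sin t)$ with one focus at the origin, and the helper names are all choices made for this example, not taken from the text.

```python
import math

a, b = 5.0, 3.0                 # assumed semi-axes, a > b
c = math.sqrt(a * a - b * b)    # distance from the center to each focus
F2 = (2 * c, 0.0)               # second focus; the first focus F1 is the origin

def unit(v):
    n = math.hypot(*v)
    return (v[0] / n, v[1] / n)

def tangent_projections(t):
    """Return (T . u1, T . u2) at the point X(t) of the ellipse."""
    X = (c + a * math.cos(t), b * math.sin(t))   # point on the ellipse
    Xp = (-a * math.sin(t), b * math.cos(t))     # derivative X'(t)
    T = unit(Xp)
    u1 = unit(X)                                 # direction of X - F1
    u2 = unit((X[0] - F2[0], X[1] - F2[1]))      # direction of X - F2
    return T[0] * u1[0] + T[1] * u1[1], T[0] * u2[0] + T[1] * u2[1]

for t in (0.3, 1.0, 2.5, 4.0):
    p, q = tangent_projections(t)
    print(t, p, q, p + q)   # the last value should vanish up to rounding
```

Since the two projections are negatives of each other, the angles that the tangent makes with the two focal directions are supplementary, which is the reflection property stated above.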

14.6 Applications to curvilinear motion. Velocity, speed, and acceleration 

Suppose a particle moves in 2-space or in 3-space in such a way that its position at time
t relative to some coordinate system is given by a vector X(t). As t varies through a time
interval, the path traced out by the particle is simply the graph of X. Thus, the vector-valued
function X serves as a natural mathematical model to describe the motion. We call
X the position function of the motion. Physical concepts such as velocity, speed, and
acceleration can be defined in terms of derivatives of the position function.

In the following discussion we assume that the position function may be differentiated 
as often as is necessary without saying so each time. 


DEFINITION. Consider a motion described by a vector-valued function X. The derivative
$X'(t)$ is called the velocity vector at time t. The length of the velocity vector, $\|X'(t)\|$, is
called the speed. The second derivative of the position vector, $X''(t)$, is called the acceleration
vector.


Notation. Sometimes the position function X is denoted by $\mathbf{r}$, the velocity vector by $\mathbf{v}$,
the speed by $v$, and the acceleration by $\mathbf{a}$. Thus, $\mathbf{v} = \mathbf{r}'$, $v = \|\mathbf{v}\|$, and $\mathbf{a} = \mathbf{v}' = \mathbf{r}''$.


If the velocity vector X’(t) is visualized as a geometric vector attached to the curve at 
X(t), we see that it lies along the tangent line. The use of the word “speed” for the length 
of the velocity vector will be justified in Section 14.12 where it is shown that the speed is the 
rate of change of arc length along the curve. This is what the speedometer of an auto- 
mobile tries to measure. Thus, the length of the velocity vector tells us how fast the par- 
ticle is moving at every instant, and its direction tells us which way it is going. The 
velocity will change if we alter either the speed or the direction of the motion (or both). The 
acceleration vector is a measure of this change. Acceleration causes the effect one feels 
when an automobile changes its speed or its direction. Unlike the velocity vector, the 
acceleration vector does not necessarily lie along the tangent line. 


EXAMPLE 1. Linear motion. Consider a motion whose position vector is given by

\[ \mathbf{r}(t) = P + f(t)A, \]

where P and A are fixed vectors, $A \neq 0$. This motion takes place along a line through
P parallel to A. The velocity, speed, and acceleration are given by

\[ \mathbf{v}(t) = f'(t)A, \qquad v(t) = \|\mathbf{v}(t)\| = |f'(t)|\,\|A\|, \qquad \mathbf{a}(t) = f''(t)A. \]

If $f'(t)$ and $f''(t)$ are nonzero, the acceleration vector is parallel to the velocity.

EXAMPLE 2. Circular motion. If a point (x, y) in $V_2$ is represented by its polar coordinates
r and $\theta$, we have

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

If r is fixed, say r = a, and if $\theta$ is allowed to vary over any interval of length at least $2\pi$,
the corresponding point (x, y) traces out a circle of radius a and center at the origin. If
we make $\theta$ a function of time t, say $\theta = f(t)$, we have a motion given by the position function

\[ \mathbf{r}(t) = a\cos f(t)\,\mathbf{i} + a\sin f(t)\,\mathbf{j}. \]

The corresponding velocity vector is given by

\[ \mathbf{v}(t) = \mathbf{r}'(t) = -af'(t)\sin f(t)\,\mathbf{i} + af'(t)\cos f(t)\,\mathbf{j}, \]
from which we find that the speed at time t is

\[ v(t) = \|\mathbf{v}(t)\| = a\,|f'(t)|. \]

The factor $|f'(t)| = |d\theta/dt|$ is called the angular speed of the particle.

An important special case occurs when $\theta = \omega t$, where $\omega$ (omega) is a positive constant.
In this case, the particle starts at the point (a, 0) at time t = 0 and moves counterclockwise
around the circle with constant angular speed $\omega$. The formulas for the position, velocity,
and speed become

\[ \mathbf{r}(t) = a\cos\omega t\,\mathbf{i} + a\sin\omega t\,\mathbf{j}, \qquad \mathbf{v}(t) = -\omega a\sin\omega t\,\mathbf{i} + \omega a\cos\omega t\,\mathbf{j}, \qquad v(t) = a\omega. \]

The acceleration vector is given by

\[ \mathbf{a}(t) = -\omega^2a\cos\omega t\,\mathbf{i} - \omega^2a\sin\omega t\,\mathbf{j} = -\omega^2\mathbf{r}(t), \]

which shows that the acceleration is always directed opposite to the position vector. When
it is visualized as a geometric vector drawn at the location of the particle, the acceleration
vector is directed toward the center of the circle. Because of this, the acceleration is called
centripetal or "center-seeking," a term originally proposed by Newton.
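
The identity $\mathbf{a}(t) = -\omega^2\mathbf{r}(t)$ for uniform circular motion can be checked without differentiating by hand. The sketch below is an illustration under assumed values (radius 2, angular speed 3, and a hypothetical helper `second_difference`); it estimates the second derivative of the position by a central second difference and compares it with $-\omega^2\mathbf{r}(t)$.

```python
import math

a, omega = 2.0, 3.0   # assumed radius and angular speed

def r(t):
    """Position on the circle of radius a at time t."""
    return (a * math.cos(omega * t), a * math.sin(omega * t))

def second_difference(f, t, h=1e-4):
    """Central second-difference estimate of f''(t), component by component."""
    left, mid, right = f(t - h), f(t), f(t + h)
    return tuple((p - 2 * q + s) / (h * h) for p, q, s in zip(left, mid, right))

t = 0.7
acc = second_difference(r, t)
expected = tuple(-omega ** 2 * x for x in r(t))
print(acc)
print(expected)   # the two tuples should agree to several decimal places
```

The agreement is limited only by the finite-difference step, so the two printed vectors match closely but not exactly.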

Note: If a moving particle has mass m, Newton's second law of motion states that the
force acting on it (due to its acceleration) is the vector $m\mathbf{a}(t)$, mass times acceleration. If
the particle moves on a circle with constant angular speed, this is called a centripetal force
because it is directed toward the center. This force is exerted by the mechanism that
confines the particle to a circular orbit. The mechanism is a string in the case of a stone
whirling in a slingshot, or gravitational attraction in the case of a satellite around the
earth. The equal and opposite reaction (due to Newton's third law), that is, the force
$-m\mathbf{a}(t)$, is said to be centrifugal or "center-fleeing."

EXAMPLE 3. Motion on an ellipse. Figure 14.5 shows an ellipse with Cartesian equation
$x^2/a^2 + y^2/b^2 = 1$, and two concentric circles with radii a and b. The angle $\theta$ shown in
the figure is called the eccentric angle. It is related to the coordinates (x, y) of a point on the
ellipse by the equations

\[ x = a\cos\theta, \qquad y = b\sin\theta. \]

As $\theta$ varies over an interval of length $2\pi$, the corresponding point (x, y) traces out the
ellipse. If we make $\theta$ a function of time t, say $\theta = f(t)$, we have a motion given by the
position function

\[ \mathbf{r}(t) = a\cos f(t)\,\mathbf{i} + b\sin f(t)\,\mathbf{j}. \]

If $\theta = \omega t$, where $\omega$ is a positive constant, the velocity, speed, and acceleration are given by

\[ \mathbf{v}(t) = \omega(-a\sin\omega t\,\mathbf{i} + b\cos\omega t\,\mathbf{j}), \qquad v(t) = \omega(a^2\sin^2\omega t + b^2\cos^2\omega t)^{1/2}, \]

\[ \mathbf{a}(t) = -\omega^2(a\cos\omega t\,\mathbf{i} + b\sin\omega t\,\mathbf{j}) = -\omega^2\mathbf{r}(t). \]

Figure 14.5 Motion on an ellipse.

Figure 14.6 Motion on a helix.


Thus, when a particle moves on an ellipse in such a way that its eccentric angle changes at a 
constant rate, the acceleration is centripetal. 

EXAMPLE 4. Motion on a helix. If a point (x, y, z) revolves around the z-axis at a constant
distance a from it and simultaneously moves parallel to the z-axis in such a way that its
z-component is proportional to the angle of revolution, the resulting path is called a
circular helix. An example is shown in Figure 14.6. If $\theta$ denotes the angle of revolution,
we have

\[ x = a\cos\theta, \qquad y = a\sin\theta, \qquad z = b\theta, \tag{14.7} \]

where $a > 0$ and $b \neq 0$. When $\theta$ varies from 0 to $2\pi$, the x- and y-coordinates return to
their original values while z changes from 0 to $2\pi b$. The number $2\pi b$ is often referred to
as the pitch of the helix.

Now suppose that $\theta = \omega t$, where $\omega$ is constant. The motion on the helix is then described
by the position vector

\[ \mathbf{r}(t) = a\cos\omega t\,\mathbf{i} + a\sin\omega t\,\mathbf{j} + b\omega t\,\mathbf{k}. \]

The corresponding velocity and acceleration vectors are given by

\[ \mathbf{v}(t) = -\omega a\sin\omega t\,\mathbf{i} + \omega a\cos\omega t\,\mathbf{j} + b\omega\,\mathbf{k}, \qquad \mathbf{a}(t) = -\omega^2(a\cos\omega t\,\mathbf{i} + a\sin\omega t\,\mathbf{j}). \]

Thus, when the acceleration vector is located on the helix, it is parallel to the xy-plane and
directed toward the z-axis.

If we eliminate $\theta$ from the first two equations in (14.7), we obtain the Cartesian equation
$x^2 + y^2 = a^2$ which we recognize as the equation of a circle in the xy-plane. In 3-space,
however, this equation represents a surface. A point (x, y, z) satisfies the equation if and
only if its distance from the z-axis is equal to a. The set of all such points is a right circular
cylinder of radius a with its axis along the z-axis. The helix winds around this cylinder.
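
The velocity and acceleration formulas for the helix are easy to explore numerically. The following sketch uses assumed values a = 2, b = 0.5, and angular rate 3, with hypothetical function names. It evaluates the formulas given in Example 4 and illustrates two consequences that follow from them: the speed is the same at every instant, and the acceleration is perpendicular to the velocity.

```python
import math

a, b, omega = 2.0, 0.5, 3.0   # assumed helix parameters

def velocity(t):
    return (-omega * a * math.sin(omega * t),
            omega * a * math.cos(omega * t),
            b * omega)

def acceleration(t):
    return (-omega ** 2 * a * math.cos(omega * t),
            -omega ** 2 * a * math.sin(omega * t),
            0.0)

def speed(t):
    return math.sqrt(sum(x * x for x in velocity(t)))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

for t in (0.0, 0.4, 1.3):
    # Speed, then v . a, then the k-component of the acceleration.
    print(t, speed(t), dot(velocity(t), acceleration(t)), acceleration(t)[2])
```

Squaring the components of the velocity gives $\omega^2a^2 + \omega^2b^2$, so the printed speed should equal $|\omega|\sqrt{a^2 + b^2}$ for every t, while the zero k-component confirms that the acceleration stays parallel to the xy-plane.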


14.7 Exercises 

In each of Exercises 1 through 6, r(t) denotes the position vector at time t for a particle moving 

on a space Curve. In each case, determine the velocity v(t) and acceleration a(t) in terms of i,j,k; 

also, compute the speed v(t). 

1. r(t) = (3 1 — t 3 )i + 3 t 2 j + (3t + t 3 )k. 4. r(t) = (t — sin t)i + (1 — cos t)j + 4 sin - k. 

2. r(t) = cos t i + sin t j + e t k. 5. r(t) = 3 t 2 i + 2v‘j + 3tk. 

3. r(t) = 3t COS 1 1 + 3t sin t j + 4 tk. 6. r(t) = t i + sin tj + (1 — cos t)k. 

7. Consider the helix described by the vector equation r(t) = a COS cot i + a sin cot j -f- bwtk, 

where to is a positive constant. Prove that the tangent line makes a constant angle with the 
z-axis and that the cosine of this angle is bj\/ a 2 + b 2 . 

8. Referring to the helix in Exercise 7, prove that the velocity v and acceleration a are vectors of 
constant length, and that 

||t> x a\\ a 

Ill’ll 3 o a -tfh 2 ' 

9. Referring to Exercise 7, let uit) denote the unit vector u{t) = sin cot i — cos otj. Prove that 
there are two constants A and B such that v x a = Au(t) + Bk, and express A and B in 
terms of a, b, and CO, 

10. Prove that for any motion the dot product of the velocity and acceleration is half the derivative
of the square of the speed:

\[ \mathbf{v}(t) \cdot \mathbf{a}(t) = \frac{1}{2}\,\frac{d}{dt}\,v^{2}(t). \]

11. Let c be a fixed unit vector. A particle moves in space in such a way that its position vector
r(t) satisfies the equation r(t) · c = e^{2t} for all t, and its velocity vector v(t) makes a constant
angle θ with c, where 0 < θ < ½π.

(a) Prove that the speed at time t is 2e^{2t}/cos θ.

(b) Compute the dot product a(t) · v(t) in terms of t and θ.

12. The identity cosh² θ − sinh² θ = 1 for hyperbolic functions suggests that the hyperbola
x²/a² − y²/b² = 1 may be represented by the parametric equations x = a cosh θ, y = b sinh θ,
or what amounts to the same thing, by the vector equation r = a cosh θ i + b sinh θ j. When
a = b = 1, the parameter θ may be given a geometric interpretation analogous to that which
holds between θ, sin θ, and cos θ in the unit circle shown in Figure 14.7(a). Figure 14.7(b)
shows one branch of the hyperbola x² − y² = 1. If the point P has coordinates x = cosh θ
and y = sinh θ, prove that θ equals twice the area of the sector OAP shaded in the figure.

[Hint: Let A(θ) denote the area of sector OAP. Show that

\[ A(\theta) = \tfrac{1}{2}\cosh\theta\,\sinh\theta - \int_{1}^{\cosh\theta} \sqrt{x^{2} - 1}\;dx. \]

Differentiate to get A'(θ) = ½.]

13. A particle moves along a hyperbola according to the equation r(t) = a cosh ωt i + b sinh ωt j,
where ω is a constant. Prove that the acceleration is centrifugal.

Figure 14.7 Analogy between parameter for a circle and that for a hyperbola. (a) Circle: x² + y² = 1. (b) Hyperbola: x² − y² = 1.


14. Prove that the tangent line at a point X of a parabola bisects the angle between the line 
joining X to the focus and the line through X parallel to the axis. This gives the reflection 
property of the parabola. (See Figure 14.3.) 

15. A particle of mass 1 moves in a plane according to the equation r(t) = x(t)i + y(t)j. It is
attracted toward the origin by a force whose magnitude is four times its distance from the
origin. At time t = 0, the initial position is r(0) = 4i and the initial velocity is v(0) = 6j.

(a) Determine the components x(t) and y(t) explicitly in terms of t. 

(b) The path of the particle is a conic section. Find a Cartesian equation for this conic, 
sketch the conic, and indicate the direction of motion along the curve. 

16. A particle moves along the parabola x² + c(y − x) = 0 in such a way that the horizontal
and vertical components of the acceleration vector are equal. If it takes T units of time to
go from the point (c, 0) to the point (0, 0), how much time will it require to go from (c, 0)
to the halfway point (c/2, c/4)?

17. Suppose a curve C is described by two equivalent functions X and Y, where Y(t) = X[u(t)].
Prove that at each point of C the velocity vectors associated with X and Y are parallel, but
that the corresponding acceleration vectors need not be parallel.


14.8 The unit tangent, the principal normal, and the osculating plane of a curve 

For linear motion the acceleration vector is parallel to the velocity vector. For circular 
motion with constant angular speed, the acceleration vector is perpendicular to the velocity. 
In this section we show that for a general motion the acceleration vector is a sum of two 
perpendicular vectors, one parallel to the velocity and one perpendicular to the velocity. 
If the motion is not linear, these two perpendicular vectors determine a plane through each 
point of the curve called the osculating plane. 

To study these concepts, we introduce the unit tangent vector T. This is another vector-valued
function associated with the curve, and it is defined by the equation

\[ T(t) = \frac{X'(t)}{\|X'(t)\|} \]

whenever the speed ‖X'(t)‖ ≠ 0. Note that ‖T(t)‖ = 1 for all t.
Figure 14.8 shows the position of the unit tangent geometric vector T(t) for various
values of t when it is attached to the curve. As the particle moves along the curve, the
corresponding vector T, being of constant length, can change only in its direction. The
tendency of T to change its direction is measured by its derivative T'. Since T has constant
length, Theorem 14.2 tells us that T is perpendicular to its derivative T'.

Figure 14.8 The unit tangent vector T. Figure 14.9 The osculating plane.

If the motion is linear, then T' = 0. If T' ≠ 0, the unit vector having the same direction
as T' is called the principal normal to the curve and it is denoted by N. Thus, N is a new
vector-valued function associated with the curve and it is defined by the equation

\[ N(t) = \frac{T'(t)}{\|T'(t)\|}, \qquad \text{whenever } \|T'(t)\| \neq 0. \]

When the two unit geometric vectors T(t) and N(t) are attached to the curve at the point 
X(t), they determine a plane known as the osculating plane of the curve. If we choose three 
values of t, say t₁, t₂, and t₃, and consider the plane determined by the three points X(t₁),
X(t₂), X(t₃), it can be shown that the position of the plane approaches the position of the
osculating plane at X(t₁) as t₂ and t₃ approach t₁. Because of this, the osculating plane is
often called the plane that best fits the curve at each of its points. If the curve itself is a 
plane curve (not a straight line), the osculating plane coincides with the plane of the curve. 
In general, however, the osculating plane changes with t. Examples are illustrated in 
Figure 14.9. 

The next theorem shows that the acceleration vector is a sum of two vectors, one parallel
to T and one parallel to T'.

THEOREM 14.9. For a motion described by a vector-valued function r, let v(t) denote the
speed at time t, v(t) = ‖r'(t)‖. Then the acceleration vector a is a linear combination of T
and T' given by the formula

(14.8)    a(t) = v'(t)T(t) + v(t)T'(t).

If T'(t) ≠ 0, we also have

(14.9)    a(t) = v'(t)T(t) + v(t)‖T'(t)‖N(t).

Proof. The formula defining the unit tangent expresses the velocity vector as the speed times T:

\[ \mathbf{v}(t) = v(t)\,T(t). \]

Differentiating this product, we find that

a(t) = v'(t)T(t) + v(t)T'(t),

which proves (14.8). To prove (14.9), we use the definition of N to write T'(t) = ‖T'(t)‖N(t).

This theorem shows that the acceleration vector always lies in the osculating plane. An 
example is shown in Figure 14.10. The coefficients of T(t) and N(t) in (14.9) are called, 
respectively, the tangential and normal components of the acceleration. A change in speed 
contributes to the tangential component, whereas a change in direction contributes to the 
normal component.
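
The decomposition (14.9) can be computed directly from the velocity and acceleration vectors. The following is a minimal sketch, not taken from the text: since T and N are perpendicular unit vectors, the tangential component is the dot product a · T, and the normal component is the length of what remains after the tangential part is removed. The helper names `dot`, `norm`, and `tangential_normal`, and the sample motion r(t) = t i + t² j, are assumptions made for illustration.

```python
import math

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

def norm(p):
    return math.sqrt(dot(p, p))

def tangential_normal(v, a):
    """Return (a_T, a_N) such that a = a_T * T + a_N * N, given velocity v and acceleration a."""
    speed = norm(v)
    T = [x / speed for x in v]          # unit tangent (requires nonzero speed)
    a_T = dot(a, T)                     # tangential component, equal to v'(t)
    perp = [ai - a_T * Ti for ai, Ti in zip(a, T)]
    a_N = norm(perp)                    # normal component, equal to v(t) * ||T'(t)||
    return a_T, a_N

# Sample motion r(t) = t i + t^2 j at t = 1: v = i + 2j, a = 2j.
a_T, a_N = tangential_normal((1.0, 2.0, 0.0), (0.0, 2.0, 0.0))
print(a_T, a_N)
```

For this sample motion the speed is √(1 + 4t²), so v'(1) = 4/√5, which agrees with the tangential component returned by the sketch.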

For a plane curve, the length of T’(t) has an interesting geometric interpretation. Since 
T is a unit vector, we may write 


T(t) = cos α(t) i + sin α(t) j,

where α(t) denotes the angle between the tangent vector and the positive x-axis, as shown
in Figure 14.11. Differentiating, we find that

T'(t) = −sin α(t) α'(t) i + cos α(t) α'(t) j = α'(t)u(t),

where u(t) is a unit vector. Therefore ‖T'(t)‖ = |α'(t)|, and this shows that ‖T'(t)‖ is a
measure of the rate of change of the angle of inclination of the tangent vector. When
α'(t) > 0, the angle is increasing, and hence u(t) = N(t). When α'(t) < 0, the angle is
decreasing and, in this case, u(t) = −N(t). The two cases are illustrated in Figure 14.11.
Note that the angle of inclination of u(t) is α(t) + ½π, since we have

u(t) = −sin α(t) i + cos α(t) j = cos(α(t) + ½π) i + sin(α(t) + ½π) j.

Figure 14.10 The acceleration vector lies in the osculating plane. Figure 14.11 The angle of inclination of the tangent vector of a plane curve.


14.9 Exercises 

Exercises 1 through 6 below refer to the motions described in Exercises 1 through 6, respectively,
of Section 14.7. For the value of t specified, (a) express the unit tangent T and the principal normal
N in terms of i, j, k; (b) express the acceleration a as a linear combination of T and N.

1. t = 2.    3. t = 0.    5. t = 1.

2. t = π.    4. t = π.    6. t = ¼π.

7. Prove that if the acceleration vector is always zero, the motion is linear. 

8. Prove that the normal component of the acceleration vector is ‖v × a‖/‖v‖.

9. For each of the following statements about a curve traced out by a particle moving in 3-space, 
either give a proof or exhibit a counter example. 

(a) If the velocity is constant, the curve lies in a plane. 

(b) If the speed is constant, the curve lies in a plane. 

(c) If the acceleration is constant, the curve lies in a plane. 

(d) If the velocity is perpendicular to the acceleration, the curve lies in a plane. 

10. A particle of unit mass with position vector r(t) at time t is moving in space under the actions 
of certain forces. 

(a) Prove that r × a = 0 implies r × v = c, where c is a constant vector.

(b) If r × v = c, where c is a constant vector, prove that the motion takes place in a plane.
Consider both c ≠ 0 and c = 0.

(c) If the net force acting on the particle is always directed toward the origin, prove that the
particle moves in a plane.

(d) Is r × v necessarily constant if a particle moves in a plane?

11. A particle moves along a curve in such a way that the velocity vector makes a constant angle 

with a given unit vector c. 

(a) If the curve lies in a plane containing c, prove that the acceleration vector is either zero 
or parallel to the velocity. 

(b) Give an example of such a curve (not a plane curve) for which the acceleration vector is 
never zero nor parallel to the velocity. 

12. A particle moves along the ellipse 3x² + y² = 1 with position vector r(t) = f(t)i + g(t)j.
The motion is such that the horizontal component of the velocity vector at time t is —g(t). 

(a) Does the particle move around the ellipse in a clockwise or counterclockwise direction? 

(b) Prove that the vertical component of the velocity vector at time t is proportional to f(t) 
and find the factor of proportionality. 

(c) How much time is required for the particle to go once around the ellipse? 

13. A plane curve C in the first quadrant has a negative slope at each of its points and passes 
through the point (§, 1). The position vector r from the origin to any point (x, y) on C
makes an angle θ with i, and the velocity vector makes an angle φ with i, where 0 < θ < ½π,
and 0 < φ < ½π. If 3 tan φ = 4 cot θ at each point of C, find a Cartesian equation for C
and sketch the curve. 

14. A line perpendicular to the tangent line of a plane curve is called a normal line. If the normal 
line and a vertical line are drawn at any point of a certain plane curve C, they cut off a segment 
of length 2 on the x-axis. Find a Cartesian equation for this curve if it passes through the 
point (1, 2). Two solutions are possible.

15. Given two fixed nonzero vectors A and B making an angle θ with each other, where 0 < θ < π.
A motion with position vector r(t) at time t satisfies the differential equation

r'(t) = A × r(t)


and the initial condition r(0) = B. 

(a) Prove that the acceleration a(t) is orthogonal to A. 

(b) Prove that the speed is constant and compute this speed in terms of A, B, and θ.

(c) Make a sketch of the curve, showing its relation to the vectors A and B. 

16. This exercise describes how the unit tangent and the principal normal are affected by a change 
of parameter. Suppose a curve C is described by two equivalent functions X and Y, where 
Y(t) = X[u(t)]. Denote the unit tangent for X by T_X and that for Y by T_Y.

(a) Prove that at each point of C we have T_Y(t) = T_X[u(t)] if u is strictly increasing, but that
T_Y(t) = −T_X[u(t)] if u is strictly decreasing. In the first case, u is said to preserve orientation;
in the second case, u is said to reverse orientation.

(b) Prove that the corresponding principal normal vectors N_X and N_Y satisfy N_Y(t) =
N_X[u(t)] at each point of C. Deduce that the osculating plane is invariant under a change of
parameter. 


14.10 The definition of arc length 

Various parts of calculus and analytic geometry refer to the arc length of a curve. Before 
we can study the properties of the length of a curve we must agree on a definition of arc 
length. The purpose of this section is to formulate such a definition. This will lead, in a 
natural way, to the construction of a function (called the arc-length function) which 
measures the length of the path traced out by a moving particle at every instant of its 
motion. Some of the basic properties of this function are discussed in Section 14.12. In 
particular, we shall prove that for most curves that arise in practice this function may be 
expressed as the integral of the speed. 

To arrive at a definition of what we mean by the length of a curve, we proceed as though 
we had to measure this length with a straight yardstick. First, we mark off a number of 
points on the curve which we use as vertices of an inscribed polygon. (An example is 
shown in Figure 14.12.) Then, we measure the total length of this polygon with our yard- 
stick and consider this as an approximation to the length of the curve. We soon observe 
that some polygons “fit” the curve better than others. In particular, if we start with a 
polygon P₁, and construct a new inscribed polygon P₂ by adding more vertices to those of P₁,
it is clear that the length of P₂ will be larger than that of P₁, as suggested in Figure 14.13.

In the same way we can form more and more polygons with successively larger and larger 
lengths. 

On the other hand, our intuition tells us that the length of any inscribed polygon should 
not exceed that of the curve (since a straight line is the shortest path between two points). 
In other words, when we arrive at a definition for the length of a curve, it should be a
number which is an upper bound to the lengths of all inscribed polygons. Therefore, it 
certainly seems reasonable to define the length of the curve to be the least upper bound 
of the lengths of all possible inscribed polygons. 

For most curves that arise in practice, this definition gives us a useful and reasonable 
way to assign a length to a curve. Surprisingly enough, however, there are certain patho- 
logical cases where this definition is not applicable. There are curves for which there is 
no upper bound to the lengths of the inscribed polygons. (An example is given in Exercise
22 in Section 14.13.) Therefore it becomes necessary to classify all curves into two categories:
those which have a length, and those which do not. The former are called rectifiable
curves, the latter, nonrectifiable.

Figure 14.12 A curve with an inscribed polygon. Figure 14.13 The polygon ABC has a length greater than the polygon AC.

To formulate these ideas in analytic terms, we begin with a curve in 3-space or in 2-space
described by a vector-valued function r, and we consider the portion of the curve traced
out by r(t) as t varies over an interval [a, b]. At the outset, we only assume that r is continuous
on the parametric interval. Later we shall add further restrictions.

Consider now any partition P of the interval [a, b], say

P = {t₀, t₁, ..., tₙ}, where a = t₀ < t₁ < ··· < tₙ = b.

Denote by π(P) the polygon whose vertices are the points r(t₀), r(t₁), ..., r(tₙ), respectively.
(An example with n = 6 is shown in Figure 14.14.) The sides of this polygon have lengths

‖r(t₁) − r(t₀)‖, ‖r(t₂) − r(t₁)‖, ..., ‖r(tₙ) − r(tₙ₋₁)‖.

Therefore, the length of the polygon π(P), which we denote by |π(P)|, is the sum

\[ |\pi(P)| = \sum_{k=1}^{n} \| r(t_k) - r(t_{k-1}) \|. \]


DEFINITION. If there exists a positive number M such that

(14.10)    |π(P)| ≤ M

for all partitions P of [a, b], then the curve is said to be rectifiable and its arc length, denoted
by A(a, b), is defined to be the least upper bound of the set of all numbers |π(P)|. If there is
no such M, the curve is called nonrectifiable.

Note that if an M exists satisfying (14.10), then, for every partition P, we have

(14.11)    |π(P)| ≤ A(a, b) ≤ M,

since the least upper bound cannot exceed any upper bound.
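
The definition can be explored numerically by computing |π(P)| for successively finer partitions. The following is a minimal sketch under stated assumptions: the names `polygon_length`, `uniform_partition`, and `circle` are hypothetical, and the unit circle traced over [0, 2π] is chosen only as a familiar test curve. Each partition in the list refines the previous one, so the lengths should not decrease, and none should exceed the circumference 2π.

```python
import math

def polygon_length(r, partition):
    """Length |pi(P)| of the inscribed polygon with vertices r(t_0), ..., r(t_n)."""
    pts = [r(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def circle(t):
    return (math.cos(t), math.sin(t))

# Partitions with 4, 8, 16 and 1024 subintervals; each refines the one before it.
lengths = [polygon_length(circle, uniform_partition(0.0, 2 * math.pi, n))
           for n in (4, 8, 16, 1024)]
print(lengths, 2 * math.pi)
```

The printed lengths increase toward 2π without reaching it, which is the behavior the least-upper-bound definition is designed to capture.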



Figure 14.14 A partition of [a, b] into six subintervals and the corresponding inscribed polygon.

It is easy to prove that a curve is rectifiable whenever its velocity vector v is continuous 
on the parametric interval [a, b]. In fact, the following theorem tells us that in this case we 
may use the integral of the speed as an upper bound for all numbers |π(P)|.

THEOREM 14.10. Denote by v(t) the velocity vector of the curve with position vector r(t)
and let v(t) = ‖v(t)‖ denote the speed. If v is continuous on [a, b], the curve is rectifiable and
its length A(a, b) satisfies the inequality

\[ A(a, b) \le \int_a^b v(t)\,dt. \tag{14.12} \]

Proof. For each partition P of [a, b], we have

\[ |\pi(P)| = \sum_{k=1}^{n} \| r(t_k) - r(t_{k-1}) \| = \sum_{k=1}^{n} \left\| \int_{t_{k-1}}^{t_k} r'(t)\,dt \right\| = \sum_{k=1}^{n} \left\| \int_{t_{k-1}}^{t_k} \mathbf{v}(t)\,dt \right\| \le \sum_{k=1}^{n} \int_{t_{k-1}}^{t_k} \|\mathbf{v}(t)\|\,dt = \int_a^b v(t)\,dt, \]

the inequality being a consequence of Theorem 14.8. This shows that we have

\[ |\pi(P)| \le \int_a^b v(t)\,dt \]

for all partitions P, and hence the number $\int_a^b v(t)\,dt$ is an upper bound for the set of all
numbers |π(P)|. This proves that the curve is rectifiable and, at the same time, it tells us
that the length A(a, b) cannot exceed the integral of the speed.

In a later section we shall prove that the inequality in (14.12) is, in fact, an equality. 
The proof of this fact will make use of the additivity of arc length, a property described in 
the next section. 

14.11 Additivity of arc length

If a rectifiable curve is cut into two pieces, the length of the whole curve is the sum of the 
lengths of the two parts. This is another of those “intuitively obvious” statements whose 
proof is not trivial. This property is called additivity of arc length and it may be expressed
analytically as follows. 

THEOREM 14.11. Consider a rectifiable curve of length A(a, b) traced out by a vector
r(t) as t varies over an interval [a, b]. If a < c < b, let C₁ and C₂ be the curves traced out by
r(t) as t varies over the intervals [a, c] and [c, b], respectively. Then C₁ and C₂ are also rectifiable
and, if A(a, c) and A(c, b) denote their respective lengths, we have

A(a, b) = A(a, c) + A(c, b).

Proof. Let P₁ and P₂ be arbitrary partitions of [a, c] and [c, b], respectively. The points
in P₁ taken together with those in P₂ give us a new partition P of [a, b] for which we have

(14.13)    |π(P₁)| + |π(P₂)| = |π(P)| ≤ A(a, b).

This shows that |π(P₁)| and |π(P₂)| are bounded by A(a, b), and hence C₁ and C₂ are
rectifiable. From (14.13), we also have

|π(P₁)| ≤ A(a, b) − |π(P₂)|.

Now, keep P₂ fixed and let P₁ vary over all possible partitions of [a, c]. Since the number
A(a, b) − |π(P₂)| is an upper bound for all numbers |π(P₁)|, it cannot be less than their
least upper bound, which is A(a, c). Hence, we have A(a, c) ≤ A(a, b) − |π(P₂)| or, what
is the same thing,

|π(P₂)| ≤ A(a, b) − A(a, c).

This shows that A(a, b) − A(a, c) is an upper bound for all the sums |π(P₂)|, and since it
cannot be less than their least upper bound, A(c, b), we have A(c, b) ≤ A(a, b) − A(a, c).
In other words, we have

(14.14)    A(a, c) + A(c, b) ≤ A(a, b).
Next we prove the reverse inequality. We begin with any partition P of [a, b]. If we
adjoin the point c to P, we obtain a partition P₁ of [a, c] and a partition P₂ of [c, b] such that

|π(P)| ≤ |π(P₁)| + |π(P₂)| ≤ A(a, c) + A(c, b).

This shows that A(a, c) + A(c, b) is an upper bound for all numbers |π(P)|. Since this
cannot be less than the least upper bound, we must have

A(a, b) ≤ A(a, c) + A(c, b).

This inequality, along with (14.14), implies the additive property.


14.12 The arc-length function 

Suppose a curve is the path traced out by a position vector r(t). A natural question to ask
is this: How far has the particle moved along the curve at time t? To discuss this question,
we introduce the arc-length function s, defined as follows:

s(t) = A(a, t) if t > a,    s(a) = 0.

The statement s(a) = 0 simply means we are assuming the motion begins when t = a.

The theorem on additivity enables us to derive some important properties of s. For 
example, we have the following. 

THEOREM 14.12. For any rectifiable curve, the arc-length function s is monotonically
increasing on [a, b]. That is, we have

(14.15)    s(t₁) ≤ s(t₂) if a ≤ t₁ < t₂ ≤ b.

Proof. If a ≤ t₁ < t₂ ≤ b, we have

s(t₂) − s(t₁) = A(a, t₂) − A(a, t₁) = A(t₁, t₂),

where the last equality comes from additivity. Since A(t₁, t₂) ≥ 0, this proves (14.15).

Next we shall prove that the function s has a derivative at each interior point of the 
parametric interval and that this derivative is equal to the speed of the particle. 

THEOREM 14.13. Let s denote the arc-length function associated with a curve and let
v(t) denote the speed at time t. If v is continuous on [a, b], then the derivative s'(t) exists for
each t in (a, b) and is given by the formula

(14.16)    s'(t) = v(t).

Proof. Define $f(t) = \int_a^t v(u)\,du$. We know that f'(t) = v(t) because of the first fundamental
theorem of calculus. We shall prove that s'(t) = v(t). For this purpose we form the
difference quotient

\[ \left\| \frac{r(t + h) - r(t)}{h} \right\|. \tag{14.17} \]

Suppose first that h > 0. The line segment joining the points r(t) and r(t + h) may be
thought of as a polygon approximating the arc joining these two points. Therefore,
because of (14.11), we have

\[ \| r(t + h) - r(t) \| \le A(t, t + h) = s(t + h) - s(t). \]

Using this in (14.17) along with the inequality (14.12) of Theorem 14.10, we have

\[ \left\| \frac{r(t + h) - r(t)}{h} \right\| \le \frac{s(t + h) - s(t)}{h} \le \frac{1}{h} \int_t^{t+h} v(u)\,du = \frac{f(t + h) - f(t)}{h}. \]

A similar argument shows that these inequalities are also valid for h < 0. If we let h → 0,
the difference quotient on the extreme left approaches ‖r'(t)‖ = v(t) and that on the
extreme right approaches f'(t) = v(t). It follows that the quotient [s(t + h) − s(t)]/h
also approaches v(t). But this means that s'(t) exists and equals v(t), as asserted.


Theorem 14.13 conforms with our intuitive notion of speed as the distance per unit time 
being covered during the motion. 

Using (14.16) along with the second fundamental theorem of calculus, we can compute 
arc length by integrating the speed. Thus, the distance traveled by a particle during a time 
interval [t₁, t₂] is

\[ s(t_2) - s(t_1) = \int_{t_1}^{t_2} s'(t)\,dt = \int_{t_1}^{t_2} v(t)\,dt. \]

In particular, when t₁ = a and t₂ = b, we obtain the following integral for arc length:

\[ A(a, b) = \int_a^b v(t)\,dt. \]

EXAMPLE 1. Length of a circular arc. To compute the length of an arc of a circle of
radius a, we may imagine a particle moving along the circle according to the equation
r(t) = a cos t i + a sin t j. The velocity vector is v(t) = −a sin t i + a cos t j and the
speed is v(t) = a. Integrating the speed over an interval of length θ, we find that the length
of arc traced out is aθ. In other words, the length of a circular arc is proportional to the
angle it subtends; the constant of proportionality is the radius of the circle. For a unit
circle we have a = 1, and the arc length is exactly equal to the angular measure.


EXAMPLE 2. Length of the graph of a real-valued function. The graph of a real-valued
function f defined on an interval [a, b] can be treated as a curve with position vector r(t)
given by

r(t) = t i + f(t) j.

The corresponding velocity vector is v(t) = i + f'(t) j, and the speed is

\[ v(t) = \|\mathbf{v}(t)\| = \sqrt{1 + [f'(t)]^{2}}. \]

Therefore, the arc length of the graph of f above a subinterval [a, x] is given by the integral

\[ s(x) = \int_a^x v(t)\,dt = \int_a^x \sqrt{1 + [f'(t)]^{2}}\,dt. \tag{14.18} \]
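
When the integral in (14.18) has no convenient closed form, it can be approximated numerically. The following is a minimal sketch, not part of the original text: it uses the composite Simpson rule, which is a choice made here for illustration, and the names `simpson` and `graph_length` are hypothetical. As a check, it is applied to f(t) = cosh t on [0, 1], where f'(t) = sinh t and the integrand √(1 + sinh² t) reduces to cosh t, so the exact length is sinh 1.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson approximation to the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def graph_length(df, a, x):
    """Arc length of the graph of f above [a, x] from (14.18), given the derivative df."""
    return simpson(lambda t: math.sqrt(1.0 + df(t) ** 2), a, x)

# f = cosh, so f' = sinh; the exact length over [0, 1] is sinh(1).
length = graph_length(math.sinh, 0.0, 1.0)
print(length, math.sinh(1.0))
```

The same `simpson` helper can integrate the speed of any motion whose velocity is continuous, since the arc length is the integral of the speed.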

14.13 Exercises 

In Exercises 1 through 9, find the length of the path traced out by a particle moving on a curve
according to the given equation during the time interval specified in each case.

1. r(t) = a(1 − cos t)i + a(t − sin t)j, 0 ≤ t ≤ 2π, a > 0.

2. r(t) = e^t cos t i + e^t sin t j, 0 ≤ t ≤ 2.

3. r(t) = a(cos t + t sin t)i + a(sin t − t cos t)j, 0 ≤ t ≤ 2π, a > 0.

4. r(t) = (c²/a) cos³ t i + (c²/b) sin³ t j, 0 ≤ t ≤ 2π, c² = a² − b², 0 < b < a.

5. r(t) = a(sinh t − t)i + a(cosh t − 1)j, 0 ≤ t ≤ T, a > 0.

6. r(t) = sin t i + t j + (1 − cos t)k (0 ≤ t ≤ 2π).

7. r(t) = t i + 3t² j + 6t³ k (0 ≤ t ≤ 2).

8. r(t) = t i + log (sec t) j + log (sec t + tan t) k (0 ≤ t ≤ ¼π).

9. r(t) = a cos ωt i + a sin ωt j + bωt k (t₀ ≤ t ≤ t₁).

10. Find an integral similar to that in (14.18) for the length of the graph of an equation of the 
form x = g(y), where g has a continuous derivative on an interval [c, d].

11. A curve has the equation y² = x³. Find the length of the arc joining (1, −1) to (1, 1).

12. Two points A and B on a unit circle with center at O determine a circular sector AOB. Prove
that the arc AB has a length equal to twice the area of the sector.

13. Set up integrals for the lengths of the curves whose equations are (a) y = e^x, 0 ≤ x ≤ 1;
(b) x = t + log t, y = t − log t, 1 ≤ t ≤ e. Show that the second length is √2 times the
first one.

14. (a) Set up the integral which gives the length of the curve y = c cosh(x/c) from x = 0 to
x = a (a > 0, c > 0).

(b) Show that c times the length of this curve is equal to the area of the region bounded by
y = c cosh(x/c), the x-axis, the y-axis, and the line x = a.

(c) Evaluate this integral and find the length of the curve when a = 2.

15. Show that the length of the curve y = cosh x joining the points (0, 1) and (x, cosh x) is
sinh x if x > 0.

16. A nonnegative function f has the property that its ordinate set over an arbitrary interval has
an area proportional to the arc length of the graph above the interval. Find f.

17. Use the vector equation r(t) = a sin t i + b cos t j, where 0 < b < a, to show that the
circumference L of an ellipse is given by the integral

\[ L = 4a \int_0^{\pi/2} \sqrt{1 - e^{2} \sin^{2} t}\;dt, \]

where e = √(a² − b²)/a. (The number e is the eccentricity of the ellipse.) This is a special case
of an integral of the form

\[ E(k) = \int_0^{\pi/2} \sqrt{1 - k^{2} \sin^{2} t}\;dt, \]

called an elliptic integral of the second kind, where 0 ≤ k < 1. The numbers E(k) have been
tabulated for various values of k.

18. If 0 < b < 4a, let r(t) = a(t − sin t)i + a(1 − cos t)j + b sin(t/2) k. Show that the length of
the path traced out from t = 0 to t = 2π is 8aE(k), where E(k) has the meaning given in
Exercise 17 and k² = 1 − (b/4a)².

19. A particle moves with position vector

\[ r(t) = tA + t^{2}B + 2\left(\tfrac{2}{3}t\right)^{3/2} A \times B, \]

where A and B are two fixed unit vectors making an angle of π/3 radians with each other.
Compute the speed of the particle at time t and find how long it takes for it to move a distance
of 12 units of arc length from the initial position r(0).

20. (a) When a circle rolls (without slipping) along a straight line, a point on the circumference
traces out a curve called a cycloid. If the fixed line is the x-axis and if the tracing point (x, y)
is originally at the origin, show that when the circle rolls through an angle θ we have

x = a(θ − sin θ),    y = a(1 − cos θ),

where a is the radius of the circle. These serve as parametric equations for the cycloid.

(b) Referring to part (a), show that dy/dx = cot ½θ and deduce that the tangent line of the
cycloid at (x, y) makes an angle ½(π − θ) with the x-axis. Make a sketch and show that the
tangent line passes through the highest point on the circle.

21. Let C be a curve described by two equivalent functions X and Y, where Y(t) = X[u(t)] for
c ≤ t ≤ d. If the function u which defines the change of parameter has a continuous derivative
in [c, d], prove that

\[ \int_{u(c)}^{u(d)} \| X'(u) \|\,du = \int_c^d \| Y'(t) \|\,dt, \]

and deduce that the arc length of C is invariant under such a change of parameter.

22. Consider the plane curve whose vector equation is r(t) = ti + f(t)j, where

\[ f(t) = t \cos\frac{\pi}{2t} \quad \text{if } t \neq 0, \qquad f(0) = 0. \]

Consider the following partition of the interval [0, 1]:

\[ P = \left\{ 0, \frac{1}{2n}, \frac{1}{2n-1}, \ldots, \frac{1}{3}, \frac{1}{2}, 1 \right\}. \]

Show that the corresponding inscribed polygon π(P) has length

\[ |\pi(P)| > 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{2n} \]

and deduce that this curve is nonrectifiable.


14.14 Curvature of a curve 

For a straight line the unit tangent vector T does not change its direction, and hence
T' = 0. If the curve is not a straight line, the derivative T' measures the tendency of the
tangent to change its direction. The rate of change of the unit tangent with respect to arc
length is called the curvature vector of the curve. We denote this by dT/ds, where s represents
arc length. The chain rule, used in conjunction with the relation s'(t) = v(t), tells us
that the curvature vector dT/ds is related to the “time” derivative T' by the equation

\[ \frac{dT}{ds} = \frac{dt}{ds}\,\frac{dT}{dt} = \frac{1}{s'(t)}\,T'(t) = \frac{1}{v(t)}\,T'(t). \]


Since T'(t) = ‖T'(t)‖N(t), we obtain

\[ \frac{dT}{ds} = \frac{\|T'(t)\|}{v(t)}\,N(t), \tag{14.19} \]

which shows that the curvature vector has the same direction as the principal normal N(t).
The scalar factor which multiplies N(t) in (14.19) is a nonnegative number called the
curvature of the curve at t and it is denoted by κ(t) (κ is the Greek letter kappa). Thus the
curvature κ(t), defined to be the length of the curvature vector, is given by the following
formula:

\[ \kappa(t) = \frac{\|T'(t)\|}{v(t)}. \tag{14.20} \]

EXAMPLE 1. Curvature of a circle. For a circle of radius a, given by r(t) = a cos t i +
a sin t j, we have v(t) = −a sin t i + a cos t j, v(t) = a, T(t) = −sin t i + cos t j, and
T'(t) = −cos t i − sin t j. Hence we have ‖T'(t)‖ = 1, so κ(t) = 1/a. This shows that a
circle has constant curvature. The reciprocal of the curvature is the radius of the circle.

When κ(t) ≠ 0, its reciprocal is called the radius of curvature and is denoted by ρ(t)
(ρ is the Greek letter rho). That circle in the osculating plane with radius ρ(t) and center
at the tip of the curvature vector is called the osculating circle. It can be shown that the
osculating circle is the limiting position of circles passing through three nearby points on
the curve as two of the points approach the third. Because of this property, the osculating
circle is often called the circle that “best fits the curve” at each of its points.


EXAMPLE 2. Curvature of a plane curve. For a plane curve, we have seen that ‖T'(t)‖ =
|α'(t)|, where α(t) is the angle the tangent vector makes with the positive x-axis, as shown
in Figure 14.11. From the chain rule, we have α'(t) = dα/dt = (dα/ds)(ds/dt) = v(t) dα/ds,
so Equation (14.20) implies

\[ \kappa(t) = \left| \frac{d\alpha}{ds} \right|. \]

In other words, the curvature of a plane curve is the absolute value of the rate of change of
α with respect to arc length. It measures the change of direction per unit distance along the
curve.


EXAMPLE 3. Plane curves of constant curvature. If dα/ds is a nonzero constant, say
dα/ds = a, then α = as + b, where b is a constant. Hence, if we use the arc length s as
a parameter, we have T = cos (as + b)i + sin (as + b)j. Integrating, we find that r =
(1/a) sin (as + b)i − (1/a) cos (as + b)j + A, where A is a constant vector. Therefore
‖r − A‖ = 1/|a|, so the curve is a circle (or an arc of a circle) with center at A and radius
1/|a|. This proves that a plane curve of constant curvature κ ≠ 0 is a circle (or an arc of a
circle) with radius 1/κ.

Now we prove a theorem which relates the curvature to the velocity and acceleration. 


THEOREM 14.14. For any motion with velocity v(t), speed v(t), acceleration a(t), and
curvature κ(t), we have

$$\mathbf{a}(t) = v'(t)\,\mathbf{T}(t) + \kappa(t)\,v^{2}(t)\,\mathbf{N}(t). \tag{14.21}$$

This formula, in turn, implies

$$\kappa(t) = \frac{\|\mathbf{a}(t) \times \mathbf{v}(t)\|}{v^{3}(t)}. \tag{14.22}$$


Proof. To prove (14.21), we rewrite (14.20) in the form ||T'(t)|| = κ(t)v(t), which gives us
T'(t) = κ(t)v(t)N(t). Substituting this expression for T'(t) in Equation (14.8), we obtain
(14.21).

To prove (14.22), we form the cross product a(t) × v(t), using (14.21) for a(t) and the
formula v(t) = v(t)T(t) for the velocity. This gives us

$$\mathbf{a} \times \mathbf{v} = v'v\,\mathbf{T} \times \mathbf{T} + \kappa v^{3}\,\mathbf{N} \times \mathbf{T} = \kappa v^{3}\,\mathbf{N} \times \mathbf{T} \tag{14.23}$$

since T × T = 0. If we take the length of each member of (14.23) and note that

$$\|\mathbf{N} \times \mathbf{T}\| = \|\mathbf{N}\|\,\|\mathbf{T}\| \sin \tfrac{1}{2}\pi = 1,$$

we obtain ||a × v|| = κv³, which proves (14.22).

In practice it is fairly easy to compute the vectors v and a (by differentiating the position
vector r); hence Equation (14.22) provides a useful method for computing the curvature.
This method is usually simpler than determining the curvature from its definition.

For a straight line we have a × v = 0, so the curvature is everywhere zero. A curve
with a small curvature at a point has a large radius of curvature there and hence does not 
differ much from a straight line in the immediate vicinity of the point. Thus the curvature 
is a measure of the tendency of a curve to deviate from a straight line. 
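
As a numerical illustration of Equation (14.22), the following Python sketch computes κ = ||a × v||/v³ from given velocity and acceleration vectors. The helper names (cross, norm, curvature, helix_state) and the sample helix with a = 2, b = 1 are illustrative assumptions, not part of the text; for that helix the result should agree with the value a/(a² + b²) that is standard for a circular helix.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in u))

def curvature(vel, acc):
    """Curvature from Equation (14.22): ||a x v|| / v^3, where v is the speed."""
    speed = norm(vel)
    return norm(cross(acc, vel)) / speed ** 3

def helix_state(t, a=2.0, b=1.0):
    """Velocity and acceleration of the sample helix r(t) = (a cos t, a sin t, b t)."""
    vel = (-a * math.sin(t), a * math.cos(t), b)
    acc = (-a * math.cos(t), -a * math.sin(t), 0.0)
    return vel, acc

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        vel, acc = helix_state(t)
        # Each value should be close to a/(a^2 + b^2) = 0.4, independent of t.
        print(t, curvature(vel, acc))
```

Only first and second derivatives of the position vector are needed, which is the practical advantage of (14.22) over the definition.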


14.15 Exercises 

1. Refer to the curves described in Exercises 1 through 6 of Section 14.9 and in each case determine
the curvature κ(t) for the value of t indicated.

2. A helix is described by the position function r(t) = a cos ωt i + a sin ωt j + bωt k. Prove that
it has constant curvature κ = a/(a² + b²).

3. Two fixed unit vectors A and B make an angle θ with each other, where 0 < θ < π. A
particle moves on a space curve in such a way that its position vector r(t) and velocity v(t)
are related by the equation v(t) = A × r(t). If r(0) = B, prove that the curve has constant
curvature and compute this curvature in terms of θ.

4. A point moves in space according to the vector equation

r(t) = 4 cos t i + 4 sin t j + 4 cos t k.

(a) Show that the path is an ellipse and find a Cartesian equation for the plane containing
this ellipse.

(b) Show that the radius of curvature is ρ(t) = 2√2 (1 + sin² t)^(3/2).

5. For the curve whose vector equation is r(t) = e^t i + e^(-t) j + √2 t k, show that the curvature is
κ(t) = √2/(e^t + e^(-t))².

6. (a) For a plane curve described by the equation r(t) = x(t)i + y(t)j, show that the curvature
is given by the formula

$$\kappa(t) = \frac{|x'(t)y''(t) - y'(t)x''(t)|}{\{[x'(t)]^{2} + [y'(t)]^{2}\}^{3/2}}.$$

(b) If a plane curve has the Cartesian equation y = f(x), show that the curvature at the point
(x, f(x)) is

$$\frac{|f''(x)|}{\{1 + [f'(x)]^{2}\}^{3/2}}.$$


7. If a point moves so that the velocity and acceleration vectors always have constant lengths,
prove that the curvature is constant at all points of the path. Express this constant in terms
of ||a|| and ||v||.

8. If two plane curves with Cartesian equations y = f(x) and y = g(x) have the same tangent
at a point (a, b) and the same curvature at that point, prove that |f''(a)| = |g''(a)|.

9. For certain values of the constants a and b, the two curves with Cartesian equations y =
ax(b - x) and (x + 2)y = x intersect at only one point P, have a common tangent line at P,
and have the same curvature at P.

(a) Find all a and b which satisfy all these conditions. 

(b) For each possible choice of a and b satisfying the given conditions, make a sketch of the 
two curves. Show how they intersect at P. 

10. (a) Prove that the radius of curvature of a parabola is smallest at its vertex. 

(b) Let A and B be two fixed unit vectors making an angle θ with each other, where 0 < θ < π.
The curve with position vector r(t) = tA + t²B is a parabola lying in the plane spanned by
A and B. Determine (in terms of A, B, and θ) the position vector of the vertex of this parabola.
You may use the property of the parabola stated in part (a).

11. A particle moves along a plane curve with constant speed 5. It starts at the origin at time
t = 0 with initial velocity 5j, and it never goes to the left of the y-axis. At every instant the
curvature of the path is κ(t) = 2t. Let α(t) denote the angle that the velocity vector makes
with the positive x-axis at time t.

(a) Determine α(t) explicitly as a function of t.

(b) Determine the velocity v(t) in terms of i and j.

12. A particle moves along a plane curve with constant speed 2. The motion starts at the origin
when t = 0 and the initial velocity v(0) is 2i. At every instant it is known that the curvature
κ(t) = 4t. Find the velocity when t = (1/4)√π if the curve never goes below the x-axis.


14.16 Velocity and acceleration in polar coordinates

Sometimes it is more natural to describe the points on a plane curve by polar coordinates
rather than rectangular coordinates. Since the rectangular coordinates (x, y) are related to
the polar coordinates r and θ by the equations

$$x = r\cos\theta, \qquad y = r\sin\theta,$$

the position vector r = xi + yj joining the origin to (x, y) is given by

$$\mathbf{r} = r\cos\theta\,\mathbf{i} + r\sin\theta\,\mathbf{j} = r(\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}),$$

where r = ||r||. This relation is illustrated in Figure 14.15.

Figure 14.15 Polar coordinates.
Figure 14.16 The unit vectors u_r and u_θ.

The vector cos θ i + sin θ j is a vector of unit length having the same direction as r.
This unit vector is usually denoted by u_r and the foregoing equation is written as follows:

$$\mathbf{r} = r\mathbf{u}_r, \quad\text{where}\quad \mathbf{u}_r = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}.$$


It is convenient to introduce also a unit vector u_θ, perpendicular to u_r, which is defined as
follows:

$$\mathbf{u}_\theta = \frac{d\mathbf{u}_r}{d\theta} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}.$$

Note that we have

$$\frac{d\mathbf{u}_\theta}{d\theta} = -\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j} = -\mathbf{u}_r.$$

In the study of plane curves, the two unit vectors u_r and u_θ play the same roles in polar
coordinates as the unit vectors i and j in rectangular coordinates. Figure 14.16 shows the
unit vectors u_r and u_θ attached to a curve at some of its points.

Now suppose the polar coordinates r and θ are functions of t, say r = f(t), θ = g(t).
We shall derive formulas for expressing the velocity and acceleration in terms of u_r and
u_θ. For the position vector, we have

$$\mathbf{r} = r\mathbf{u}_r = f(t)\,\mathbf{u}_r.$$

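Since θ depends on the parameter t, the same is true of the unit vector u_r and we must take
this into account when we compute the velocity vector. Thus we have

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d(r\mathbf{u}_r)}{dt} = \frac{dr}{dt}\,\mathbf{u}_r + r\,\frac{d\mathbf{u}_r}{dt}.$$

Using the chain rule, we may express du_r/dt in terms of u_θ by writing

$$\frac{d\mathbf{u}_r}{dt} = \frac{d\theta}{dt}\,\frac{d\mathbf{u}_r}{d\theta} = \frac{d\theta}{dt}\,\mathbf{u}_\theta, \tag{14.24}$$

and the equation for the velocity vector becomes

$$\mathbf{v} = \frac{dr}{dt}\,\mathbf{u}_r + r\,\frac{d\theta}{dt}\,\mathbf{u}_\theta. \tag{14.25}$$

The scalar factors dr/dt and r dθ/dt multiplying u_r and u_θ are called, respectively, the
radial and transverse components of velocity.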

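Since u_r and u_θ are orthogonal unit vectors, we find that

$$\mathbf{v} \cdot \mathbf{v} = \left(\frac{dr}{dt}\right)^{2} + \left(r\,\frac{d\theta}{dt}\right)^{2},$$

so the speed v is given by the formula

$$v = \sqrt{\left(\frac{dr}{dt}\right)^{2} + \left(r\,\frac{d\theta}{dt}\right)^{2}}.$$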


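Differentiating both sides of (14.25), we find that the acceleration vector is given by

$$\mathbf{a} = \left(\frac{d^{2}r}{dt^{2}}\,\mathbf{u}_r + \frac{dr}{dt}\,\frac{d\mathbf{u}_r}{dt}\right) + \left(r\,\frac{d^{2}\theta}{dt^{2}}\,\mathbf{u}_\theta + \frac{dr}{dt}\,\frac{d\theta}{dt}\,\mathbf{u}_\theta + r\,\frac{d\theta}{dt}\,\frac{d\mathbf{u}_\theta}{dt}\right).$$

The derivative du_r/dt may be expressed in terms of u_θ by (14.24). We may similarly express
the derivative of u_θ by the equation

$$\frac{d\mathbf{u}_\theta}{dt} = \frac{d\theta}{dt}\,\frac{d\mathbf{u}_\theta}{d\theta} = -\frac{d\theta}{dt}\,\mathbf{u}_r.$$

This leads to the following formula which expresses a in terms of its radial and transverse
components:

$$\mathbf{a} = \left(\frac{d^{2}r}{dt^{2}} - r\left(\frac{d\theta}{dt}\right)^{2}\right)\mathbf{u}_r + \left(r\,\frac{d^{2}\theta}{dt^{2}} + 2\,\frac{dr}{dt}\,\frac{d\theta}{dt}\right)\mathbf{u}_\theta. \tag{14.26}$$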


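When θ = t, the curve may be described by the polar equation r = f(θ). In this case, the
formulas for velocity, speed, and acceleration simplify considerably, and we obtain

$$\mathbf{v} = \frac{dr}{d\theta}\,\mathbf{u}_r + r\,\mathbf{u}_\theta, \qquad v = \sqrt{\left(\frac{dr}{d\theta}\right)^{2} + r^{2}}, \qquad \mathbf{a} = \left(\frac{d^{2}r}{d\theta^{2}} - r\right)\mathbf{u}_r + 2\,\frac{dr}{d\theta}\,\mathbf{u}_\theta.$$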
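
The radial and transverse components in (14.25) and (14.26) can be checked against direct differentiation in rectangular coordinates. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sample motion r = t, θ = t is chosen only because its rectangular form x = t cos t, y = t sin t is easy to differentiate by hand.

```python
import math

def polar_velocity(r, dr, dtheta):
    """Radial and transverse components of velocity, Equation (14.25)."""
    return dr, r * dtheta

def polar_acceleration(r, dr, d2r, dtheta, d2theta):
    """Radial and transverse components of acceleration, Equation (14.26)."""
    radial = d2r - r * dtheta ** 2
    transverse = r * d2theta + 2.0 * dr * dtheta
    return radial, transverse

def to_cartesian(radial, transverse, theta):
    """Return radial*u_r + transverse*u_theta as an (x, y) pair."""
    u_r = (math.cos(theta), math.sin(theta))
    u_theta = (-math.sin(theta), math.cos(theta))
    return (radial * u_r[0] + transverse * u_theta[0],
            radial * u_r[1] + transverse * u_theta[1])

if __name__ == "__main__":
    # Sample motion (an assumption for illustration): r = t, theta = t.
    t = 1.7
    a_rad, a_trans = polar_acceleration(r=t, dr=1.0, d2r=0.0, dtheta=1.0, d2theta=0.0)
    ax, ay = to_cartesian(a_rad, a_trans, theta=t)
    # Differentiating x = t cos t, y = t sin t twice gives the same vector.
    print(ax, -2.0 * math.sin(t) - t * math.cos(t))
    print(ay, 2.0 * math.cos(t) - t * math.sin(t))
```

Each printed pair should agree up to rounding, since both columns describe the same acceleration vector.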




14.17 Plane motion with radial acceleration 

The acceleration vector is said to be radial if the transverse component in Equation
(14.26) is always zero. This component is equal to

$$r\,\frac{d^{2}\theta}{dt^{2}} + 2\,\frac{dr}{dt}\,\frac{d\theta}{dt} = \frac{1}{r}\,\frac{d}{dt}\left(r^{2}\,\frac{d\theta}{dt}\right).$$

Therefore, the acceleration is radial if and only if r² dθ/dt is constant.
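
The identity for the transverse component follows from the product rule. Written out as a short verification, using only the symbols already introduced:

```latex
\frac{1}{r}\,\frac{d}{dt}\!\left(r^{2}\,\frac{d\theta}{dt}\right)
  = \frac{1}{r}\left(2r\,\frac{dr}{dt}\,\frac{d\theta}{dt} + r^{2}\,\frac{d^{2}\theta}{dt^{2}}\right)
  = 2\,\frac{dr}{dt}\,\frac{d\theta}{dt} + r\,\frac{d^{2}\theta}{dt^{2}} .
```

Wherever r ≠ 0, the transverse component therefore vanishes exactly when the derivative of r² dθ/dt vanishes.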

Plane motion with radial acceleration has an interesting geometric interpretation in
terms of area. Denote by A(t) the area of the region swept out by the position vector from
a fixed time, say t = a, to a later time t. An example is the shaded region shown in Figure
14.17. We shall prove that the time rate of change of this area is exactly equal to ½ r² dθ/dt.
That is, we have

$$A'(t) = \tfrac{1}{2}\,r^{2}\,\frac{d\theta}{dt}. \tag{14.27}$$

From this it follows that the acceleration vector is radial if and only if the position vector 
sweeps out area at a constant rate. 

To prove (14.27) we assume that it is possible to eliminate t from the two equations
r = f(t), θ = g(t), and thereby express r as a function of θ, say r = R(θ). This means that
there is a real-valued function R such that R[g(t)] = f(t). Then the shaded region in Figure
14.17 is the radial set of R over the interval [g(a), g(t)]. By Theorem 2.6, the area of this
region is given by the integral

$$A(t) = \frac{1}{2}\int_{g(a)}^{g(t)} R^{2}(\theta)\,d\theta.$$

Differentiating this integral by the first fundamental theorem of calculus and the chain
rule, we find that

$$A'(t) = \tfrac{1}{2}\,R^{2}[g(t)]\,g'(t) = \tfrac{1}{2}\,f^{2}(t)\,g'(t) = \tfrac{1}{2}\,r^{2}\,\frac{d\theta}{dt},$$


which proves (14.27). 
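
Equation (14.27) can be checked numerically on a simple motion with radial acceleration. The sketch below assumes the standard identity r² dθ/dt = x y' - y x' for converting to rectangular coordinates, and uses the sample motion (a cos t, b sin t), whose acceleration is the negative of its position vector and hence radial. Both the identity and the sample are additions for illustration, and the function names are hypothetical.

```python
import math

def areal_rate(x, y, dx, dy):
    """A'(t) = (1/2) r^2 dtheta/dt, rewritten as (1/2)(x*dy - y*dx)."""
    return 0.5 * (x * dy - y * dx)

def ellipse_state(t, a=3.0, b=2.0):
    """Position (x, y) and velocity (dx, dy) of the motion (a cos t, b sin t).

    The acceleration is (-a cos t, -b sin t) = -(x, y), so it is radial.
    """
    return (a * math.cos(t), b * math.sin(t),
            -a * math.sin(t), b * math.cos(t))

if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0, 4.0):
        # The rate stays at a*b/2 = 3.0 (up to rounding) for every t.
        print(t, areal_rate(*ellipse_state(t)))
```

Over one full period 2π the swept area is then (ab/2)(2π) = πab, the area of the ellipse.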

14.18 Cylindrical coordinates

If the x- and y-coordinates of a point P = (x, y, z) in 3-space are replaced by polar
coordinates r and θ, then the three numbers r, θ, z are called cylindrical coordinates for
the point P. The nonnegative number r now represents the distance from the z-axis to 
the point P, as indicated in Figure 14.18. Those points in space for which r is constant are 
at a fixed distance from the z-axis and therefore lie on a circular cylinder (hence the name 
cylindrical coordinates). 



Figure 14.17 The position vector sweeps out area at the rate A'(t) = ½ r² dθ/dt.
Figure 14.18 Cylindrical coordinates.

To discuss space curves in cylindrical coordinates, the equation for the position vector
r must be replaced by one of the form

$$\mathbf{r} = r\mathbf{u}_r + z(t)\,\mathbf{k}.$$

Corresponding formulas for the velocity and acceleration vectors are obtained by merely 
adding the terms z'(t)k and z"(t)k, respectively, to the right-hand members of the two- 
dimensional formulas in (14.25) and (14.26). 

14.19 Exercises 

1. A particle moves in a plane so that its position at time t has polar coordinates r = t, θ = t.
Find formulas for the velocity v, the acceleration a, and the curvature κ at any time t.

2. A particle moves in space so that its position at time t has cylindrical coordinates r = t,
θ = t, z = t. It traces out a curve called a conical helix.

(a) Find formulas for the velocity v, the acceleration a, and the curvature κ at time t.

(b) Find a formula for determining the angle between the velocity vector and the generator
of the cone at each point of the curve.

3. A particle moves in space so that its position at time t has cylindrical coordinates r = sin t,
θ = t, z = log sec t, where 0 < t < ½π.

(a) Show that the curve lies on the cylinder with Cartesian equation x² + (y - ½)² = ¼.

(b) Find a formula (in terms of t) for the angle which the velocity vector makes with k.

4. If a curve is given by a polar equation r = f(θ), where a < θ < b < a + 2π, prove that its
arc length is

$$\int_a^b \sqrt{r^{2} + \left(\frac{dr}{d\theta}\right)^{2}}\;d\theta.$$



5. The curve described by the polar equation r = a(1 + cos θ), where a > 0 and 0 < θ < 2π,
is called a cardioid. Draw a graph of the cardioid r = 4(1 + cos θ) and compute its arc length.

6. A particle moves along a plane curve whose polar equation is r = e^(cθ), where c is a constant
and θ varies from 0 to 2π.

(a) Make a sketch indicating the general shape of the curve for each of the following values
of c: c = 0, c = 1, c = -1.

(b) Let L(c) denote the arc length of the curve and let a(c) denote the area of the region swept
out by the position vector as θ varies from 0 to 2π. Compute L(c) and a(c) in terms of c.

7. Sketch the curve whose polar equation is r = sin² θ, 0 < θ < 2π, and show that it consists
of two loops.

(a) Find the area of region enclosed by one loop of the curve. 

(b) Compute the length of one loop of the curve. 

In each of Exercises 8 through 11, make a sketch of the plane curve having the given polar 
equation and compute its arc length. 

8. r = θ, 0 < θ < π.
9. r = e^θ, 0 < θ < π.
10. r = 1 + cos θ, 0 < θ < π.
11. r = 1 - cos θ, 0 < θ < 2π.

12. If a curve has the polar equation r = f(θ), show that its radius of curvature ρ is given by the
formula ρ = (r² + r'²)^(3/2)/|r² - rr'' + 2r'²|, where r' = f'(θ) and r'' = f''(θ).

13. For each of the curves in Exercises 8 through 11, compute the radius of curvature for the
value of θ indicated.

(a) Arbitrary θ in Exercise 8.
(b) Arbitrary θ in Exercise 9.
(c) θ = in Exercise 10.
(d) θ = | n in Exercise 11.

14. Let φ denote the angle, 0 < φ < π, between the position vector and the velocity vector of a
curve. If the curve is expressed in polar coordinates, prove that v sin φ = r and v cos φ =
dr/dθ, where v is the speed.

15. A missile is designed to move directly toward its target. Due to mechanical failure, its
direction in actual flight makes a fixed angle α ≠ 0 with the line from the missile to the target.
Find the path if it is fired at a fixed target. Discuss how the path varies with α. Does the
missile ever reach the target? (Assume the motion takes place in a plane.)

16. Due to a mechanical failure, a ground crew has lost control of a missile recently fired. It is
known that the missile will proceed at a constant speed on a straight course of unknown
direction. When the missile is 4 miles away, it is sighted for an instant and lost again.
Immediately an anti-missile missile is fired with a constant speed three times that of the first missile.
What should be the course of the second missile in order for it to overtake the first one? 
(Assume both missiles move in the same plane.) 

17. Prove that if a homogeneous first-order differential equation of the form y' = f(x, y) is
rewritten in polar coordinates, it reduces to a separable equation. Use this method to solve
y' = (y - x)/(y + x).

18. A particle (moving in space) has velocity vector v = ωk × r, where ω is a positive constant
and r is the position vector. Prove that the particle moves along a circle with constant angular
speed ω. (The angular speed is defined to be |dθ/dt|, where θ is the polar angle at time t.)

19. A particle moves in a plane perpendicular to the z-axis. The motion takes place along a
circle with center on this axis.

(a) Show that there is a vector w(t) parallel to the z-axis such that

v(t) = w(t) × r(t),

where r(t) and v(t) denote the position and velocity vectors at time t. The vector w(t) is called
the angular velocity vector and its magnitude ω(t) = ||w(t)|| is called the angular speed.

(b) The vector α(t) = w'(t) is called the angular acceleration vector. Show that the
acceleration vector a(t) [= v'(t)] is given by the formula

a(t) = [w(t) · r(t)]w(t) - ω²(t)r(t) + α(t) × r(t).

(c) If the particle lies in the xy-plane and if the angular speed ω(t) is constant, say ω(t) = ω,
prove that the acceleration vector a(t) is centripetal and that, in fact, a(t) = -ω²r(t).

20. A body is said to undergo a rigid motion if, for every pair of particles p and q in the body,
the distance ||r_p(t) - r_q(t)|| is independent of t, where r_p(t) and r_q(t) denote the position vectors
of p and q at time t. Prove that for a rigid motion in which each particle p rotates about
the z-axis we have v_p(t) = w(t) × r_p(t), where w(t) is the same for each particle, and v_p(t) is
the velocity of particle p.


14.20 Applications to planetary motion 

By analyzing the voluminous data on planetary motion accumulated up to 1600, the 
German astronomer Johannes Kepler (1571-1630) tried to discover the mathematical 
laws governing the motions of the planets. There were six known planets at that time 
and, according to the Copernican theory, their orbits were thought to lie on concentric 
spherical shells about the sun. Kepler attempted to show that the radii of these shells 
were linked up with the five regular solids of geometry. He proposed an ingenious idea 
that the solar system was designed something like a Chinese puzzle. At the center of the 
system he placed the sun. Then, in succession, he arranged the six concentric spheres 
that can be inscribed and circumscribed around the five regular solids: the octahedron,
icosahedron, dodecahedron, tetrahedron, and cube, in respective order (from inside out). 
The innermost sphere, inscribed in the regular octahedron, corresponded to Mercury’s 
path. The next sphere, which circumscribed the octahedron and inscribed the icosahedron, 
corresponded to the orbit of Venus. Earth’s orbit lay on the sphere around the icosahedron 
and inside the dodecahedron, and so on, the outermost sphere, containing Jupiter’s 
orbit, being circumscribed around the cube. Although this theory seemed correct to 
within five percent, astronomical observations at that time were accurate to a percentage error 
much smaller than this, and Kepler finally realized that he had to modify this theory. 
After much further study it occurred to him that the observed data concerning the orbits 
corresponded more to elliptical paths than the circular paths of the Copernican system. 
After several more years of unceasing effort, Kepler set forth three famous laws, empiri- 
cally discovered, which explained all the astronomical phenomena known at that time. 
They may be stated as follows: 

Kepler's first law. Planets move in ellipses with the sun at one focus.

Kepler's second law. The position vector from the sun to a planet sweeps out area at a
constant rate.

Kepler's third law. The square of the period of a planet is proportional to the cube of its
mean distance from the sun.

Note: By the period of a planet is meant the time required to go once around the
elliptical orbit. The mean distance from the sun is one half the length of the major axis
of the ellipse.

The formulation of these laws from a study of astronomical tables was a remarkable 
achievement. Nearly 50 years later, Newton proved that all three of Kepler’s laws are 
consequences of his own second law of motion and his celebrated universal law of gravi- 
tation. In this section we shall use vector methods to show how Kepler’s laws may be 
deduced from Newton’s. 



Figure 14.19 The position vector from the sun to a planet.

Assume we have a fixed sun of mass M and a moving planet of mass m attracted to 
the sun by a force F. (We neglect the influence of all other forces.) Newton’s second law 
of motion states that 

$$\mathbf{F} = m\mathbf{a}, \tag{14.28}$$

where a is the acceleration vector of the moving planet. Denote by r the position vector
from the sun to the planet (as in Figure 14.19), let r = ||r||, and let u_r be a unit vector with
the same direction as r, so that r = ru_r. The universal law of gravitation states that

$$\mathbf{F} = -G\,\frac{mM}{r^{2}}\,\mathbf{u}_r,$$

where G is a constant. Combining this with (14.28), we obtain

$$\mathbf{a} = -\frac{GM}{r^{2}}\,\mathbf{u}_r, \tag{14.29}$$

which tells us that the acceleration is radial. In a moment we shall prove that the orbit
lies in a plane. Once we know this, it follows at once from the results of Section 14.17 that
the position vector sweeps out area at a constant rate.

To prove that the path lies in a plane we use the fact that r and a are parallel. If we
introduce the velocity vector v = dr/dt, we have

$$\mathbf{r} \times \mathbf{a} = \mathbf{r} \times \frac{d\mathbf{v}}{dt} + \mathbf{v} \times \mathbf{v} = \mathbf{r} \times \frac{d\mathbf{v}}{dt} + \frac{d\mathbf{r}}{dt} \times \mathbf{v} = \frac{d}{dt}(\mathbf{r} \times \mathbf{v}).$$

Since r × a = 0, this means that r × v is a constant vector, say r × v = c.

If c = 0, the position vector r is parallel to v and the motion is along a straight line.
Since the path of a planet is not a straight line, we must have c ≠ 0. The relation r × v = c
shows that r · c = 0, so the position vector lies in a plane perpendicular to c. Since the
acceleration is radial, r sweeps out area at a constant rate. This proves Kepler's second
law.

It is easy to prove that this constant rate is exactly half the length of the vector c. In
fact, if we use polar coordinates and express the velocity in terms of u_r and u_θ as in Equation
(14.25), we find that

$$\mathbf{c} = \mathbf{r} \times \mathbf{v} = (r\mathbf{u}_r) \times \left(\frac{dr}{dt}\,\mathbf{u}_r + r\,\frac{d\theta}{dt}\,\mathbf{u}_\theta\right) = r^{2}\,\frac{d\theta}{dt}\,\mathbf{u}_r \times \mathbf{u}_\theta, \tag{14.30}$$

and hence ||c|| = |r² dθ/dt|. By (14.27) this is equal to 2|A'(t)|, where A'(t) is the rate at
which the radius vector sweeps out area.

Kepler’s second law is illustrated in Figure 14.20. The two shaded regions, which are 
swept out by the position vector in equal time intervals, have equal areas. 
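
The constancy of r × v can be observed in a small numerical experiment. The sketch below integrates the inverse-square acceleration (14.29) with a velocity-Verlet (leapfrog) step and records c = r × v along the way. The integrator, the units with GM = 1, the initial state, and all function names are assumptions chosen for illustration and are not part of the text.

```python
import math

GM = 1.0  # assumed units in which GM = 1

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def acceleration(pos):
    """Equation (14.29): a = -(GM / r^2) u_r = -GM * pos / r^3."""
    r = math.sqrt(dot(pos, pos))
    return tuple(-GM * p / r ** 3 for p in pos)

def simulate(pos, vel, dt, steps):
    """Velocity-Verlet integration; returns a list of (position, r x v) pairs."""
    acc = acceleration(pos)
    history = [(pos, cross(pos, vel))]
    for _ in range(steps):
        half = tuple(v + 0.5 * dt * a for v, a in zip(vel, acc))
        pos = tuple(p + dt * h for p, h in zip(pos, half))
        acc = acceleration(pos)
        vel = tuple(h + 0.5 * dt * a for h, a in zip(half, acc))
        history.append((pos, cross(pos, vel)))
    return history

if __name__ == "__main__":
    hist = simulate(pos=(1.0, 0.0, 0.0), vel=(0.0, 0.8, 0.3), dt=0.01, steps=2000)
    c0 = hist[0][1]
    drift = max(abs(c[i] - c0[i]) for _, c in hist for i in range(3))
    off_plane = max(abs(dot(p, c0)) for p, _ in hist)
    # Both quantities should be at the level of rounding error.
    print("c =", c0, "max drift =", drift, "max |r . c| =", off_plane)
```

The quantity r · c measures how far the computed positions stray from the plane perpendicular to c, and half the length of c is the areal rate discussed above.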

We shall prove next that the path is an ellipse. First of all, we form the cross product
a × c, using (14.29) and (14.30), and we find that

$$\mathbf{a} \times \mathbf{c} = \left(-\frac{GM}{r^{2}}\,\mathbf{u}_r\right) \times \left(r^{2}\,\frac{d\theta}{dt}\,\mathbf{u}_r \times \mathbf{u}_\theta\right) = -GM\,\frac{d\theta}{dt}\,\mathbf{u}_r \times (\mathbf{u}_r \times \mathbf{u}_\theta) = GM\,\frac{d\theta}{dt}\,\mathbf{u}_\theta.$$

Since a = dv/dt and u_θ = du_r/dθ, the foregoing equation for a × c can also be written
as follows:

$$\frac{d}{dt}(\mathbf{v} \times \mathbf{c}) = \frac{d}{dt}(GM\mathbf{u}_r).$$

Integration gives

$$\mathbf{v} \times \mathbf{c} = GM\mathbf{u}_r + \mathbf{b},$$

where b is another constant vector. We can rewrite this as follows:

$$\mathbf{v} \times \mathbf{c} = GM(\mathbf{u}_r + \mathbf{e}), \tag{14.31}$$

where GMe = b. We shall combine this with (14.30) to eliminate v and obtain an equation
for r. For this purpose we dot multiply both sides of (14.30) by c and both sides of (14.31)
by r. Equating the two expressions for the scalar triple product r · v × c, we are led to the
equation

$$GMr(1 + e\cos\phi) = c^{2}, \tag{14.32}$$

where e = ||e||, c = ||c||, and φ represents the angle between the constant vector e and the
radius vector r. (See Figure 14.21.) If we let d = c²/(GMe), Equation (14.32) becomes

$$r = \frac{ed}{e\cos\phi + 1} \qquad\text{or}\qquad r = e(d - r\cos\phi). \tag{14.33}$$

Figure 14.20 Kepler's second law. The two shaded regions, swept out in equal time
intervals, have equal areas.
Figure 14.21 The ratio r/(d - r cos φ) is the eccentricity e = ||e||.

By Theorem 13.18, this is the polar equation of a conic section with eccentricity e and a
focus at the sun. Figure 14.21 shows the directrix drawn perpendicular to e at a distance
d from the sun. The distance from the planet to the directrix is d - r cos φ, and the ratio
r/(d - r cos φ) is the eccentricity e. The conic is an ellipse if e < 1, a parabola if e = 1,
and a hyperbola if e > 1. Since planets are known to move on closed paths, the orbit 
under consideration must be an ellipse. This proves Kepler’s first law. 
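
Equations (14.30) through (14.32) translate directly into a computation: from one position and one velocity, form c = r × v, solve (14.31) for e, and classify the conic by e = ||e||. The function names, the sample states, and the tolerance used to detect e = 1 are assumptions made for this sketch.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean length of a 3-vector."""
    return math.sqrt(dot(u, u))

def eccentricity_vector(pos, vel, GM):
    """Solve (14.31), v x c = GM (u_r + e), for the constant vector e."""
    c = cross(pos, vel)
    r = norm(pos)
    v_cross_c = cross(vel, c)
    return tuple(v_cross_c[i] / GM - pos[i] / r for i in range(3))

def orbit_residual(pos, vel, GM):
    """Left side minus right side of (14.32), with r e cos(phi) written as r . e."""
    c = cross(pos, vel)
    e = eccentricity_vector(pos, vel, GM)
    return GM * (norm(pos) + dot(pos, e)) - dot(c, c)

def classify(ecc, tol=1e-9):
    """Conic type by eccentricity; tol is an assumed tolerance for detecting e = 1."""
    if abs(ecc - 1.0) <= tol:
        return "parabola"
    return "ellipse" if ecc < 1.0 else "hyperbola"

if __name__ == "__main__":
    pos = (1.0, 0.0, 0.0)
    for speed in (1.0, 1.2, math.sqrt(2.0), 1.5):
        vel = (0.0, speed, 0.0)
        ecc = norm(eccentricity_vector(pos, vel, GM=1.0))
        print(speed, ecc, classify(ecc), orbit_residual(pos, vel, GM=1.0))
```

In this sketch a circular orbit (e = 0) is reported as an ellipse, and the residual of (14.32) should vanish up to rounding for every state, bound or not.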

Finally, we deduce Kepler's third law. Suppose the ellipse has major axis of length 2a
and minor axis of length 2b. Then the area of the ellipse is πab. Let T be the time it takes
for the planet to go once around the ellipse. Since the position vector sweeps out area at
the rate ½c, we have ½cT = πab, or T = 2πab/c. We wish to prove that T² is proportional
to a³.

From Section 13.22 we have b² = a²(1 - e²), ed = a(1 - e²), so

$$c^{2} = GMed = GMa(1 - e^{2}),$$

and hence we have

$$T^{2} = \frac{4\pi^{2}a^{2}b^{2}}{c^{2}} = \frac{4\pi^{2}a^{4}(1 - e^{2})}{GMa(1 - e^{2})} = \frac{4\pi^{2}}{GM}\,a^{3}.$$

Since T² is a constant times a³, this proves Kepler's third law.
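
The algebra of this last step can be spot-checked numerically. The sketch below (the function name and the sample values are illustrative assumptions) computes T = 2πab/c from a, e, and GM and compares T²/a³ with 4π²/GM.

```python
import math

def period(a, e, GM):
    """Period T = 2*pi*a*b/c, with b^2 = a^2 (1 - e^2) and c^2 = GM a (1 - e^2)."""
    b = a * math.sqrt(1.0 - e * e)
    c = math.sqrt(GM * a * (1.0 - e * e))
    return 2.0 * math.pi * a * b / c

if __name__ == "__main__":
    GM = 1.0  # assumed units
    for a, e in ((1.0, 0.0), (1.0, 0.6), (4.0, 0.3)):
        T = period(a, e, GM)
        # The ratio T^2 / a^3 should equal 4*pi^2/GM regardless of e.
        print(a, e, T * T / a ** 3, 4.0 * math.pi ** 2 / GM)
```

The eccentricity drops out of the ratio, which is the point of the cancellation of the factor 1 - e² in the derivation.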

14.21 Miscellaneous review exercises

1. Let r denote the vector from the origin to an arbitrary point on the parabola y² = x, let α
be the angle that r makes with the tangent line, 0 ≤ α ≤ π, and let θ be the angle that r makes
with the positive x-axis, 0 < θ < π. Express α in terms of θ.

2. Show that the vector T = yi + 2cj is tangent to the parabola y² = 4cx at the point (x, y),
and that the vector N = 2ci - yj is perpendicular to T.

[Hint: Write a vector equation for the parabola, using y as a parameter.]

3. Prove that an equation of the line of slope m that is tangent to the parabola y² = 4cx can
be written in the form y = mx + c/m. What are the coordinates of the point of contact?

4. (a) Solve Exercise 3 for the parabola (y - y₀)² = 4c(x - x₀).

(b) Solve Exercise 3 for the parabola x² = 4cy and, more generally, for the parabola
(x - x₀)² = 4c(y - y₀).

5. Prove that an equation of the line that is tangent to the parabola y² = 4cx at the point
(x₁, y₁) can be written in the form y₁y = 2c(x + x₁).

6. Solve Exercise 5 for each of the parabolas described in Exercise 4. 

7. (a) Let P be a point on the parabola y = x². Let Q be the point of intersection of the normal
line at P with the y-axis. What is the limiting position of Q as P tends to the y-axis?

(b) Solve the same problem for the curve y = f(x), where f'(0) = 0.

8. The line y = c intersects the parabola y = x² at two points. Find the radius of
the circle passing through these two points and through the vertex of the parabola. The radius
you determine depends on c. What happens to this radius as c → 0?

9. Prove that a point (x₀, y₀) is inside, on, or outside the ellipse x²/a² + y²/b² = 1 according as
x₀²/a² + y₀²/b² is less than, equal to, or greater than 1.

10. Consider the ellipse x²/a² + y²/b² = 1. Show that the vectors T and N given by

$$\mathbf{T} = -\frac{y}{b^{2}}\,\mathbf{i} + \frac{x}{a^{2}}\,\mathbf{j}, \qquad \mathbf{N} = \frac{x}{a^{2}}\,\mathbf{i} + \frac{y}{b^{2}}\,\mathbf{j}$$

are, respectively, tangent and normal to the ellipse when placed at the point (x, y). If the
eccentric angle of (x₀, y₀) is θ₀, show that the tangent line at (x₀, y₀) has the Cartesian equation

$$\frac{x}{a}\cos\theta_{0} + \frac{y}{b}\sin\theta_{0} = 1.$$

11. Show that the tangent line to the ellipse x²/a² + y²/b² = 1 at the point (x₀, y₀) has the
equation x₀x/a² + y₀y/b² = 1.

12. Prove that the product of the perpendicular distances from the foci of an ellipse to any tangent 
line is constant, this constant being the square of the length of half the minor axis. 

13. Two tangent lines are drawn to the ellipse x² + 4y² = 8, each parallel to the line x + 2y = 7.
Find the points of tangency. 

14. A circle passes through both foci of an ellipse and is tangent to the ellipse at two points. 
Find the eccentricity of the ellipse. 

15. Let V be one of the two vertices of a hyperbola whose transverse axis has length 2a and whose
eccentricity is 2. Let P be a point on the same branch as V. Denote by A the area of the
region bounded by the hyperbola and the line segment VP, and let r be the length of VP.

(a) Place the coordinate axes in a convenient position and write an equation for the hyperbola.

(b) Express the area A as an integral and, without attempting to evaluate this integral, show
that Ar⁻³ tends to a limit as the point P tends to V. Find this limit.

16. Show that the vectors T = (y/b²)i + (x/a²)j and N = (x/a²)i - (y/b²)j are, respectively,
tangent and normal to the hyperbola x²/a² - y²/b² = 1 if placed at the point (x, y) on the curve.

17. Show that the tangent line to the hyperbola x²/a² - y²/b² = 1 at the point (x₀, y₀) is given
by the equation x₀x/a² - y₀y/b² = 1.

18. The normal line at each point of a curve and the line from that point to the origin form an 
isosceles triangle whose base is on the x-axis. Show that the curve is a hyperbola. 

19. The normal line at a point P of a curve intersects the x-axis at X and the y-axis at Y. Find 
the curve if each P is the mid-point of the corresponding line segment XY and if the point 
(4, 5) is on the curve. 

20. Prove that the product of the perpendicular distances from an arbitrary point on a hyperbola 
to its asymptotes is constant. 

21. A curve is given by a polar equation r = f(θ). Find f if an arbitrary arc joining two distinct
points of the curve has arc length proportional to (a) the angle subtended at the origin; (b)
the difference of the radial distances from the origin to its endpoints; (c) the area of the sector
formed by the arc and the radii to its endpoints.

22. If a curve in 3-space is described by a vector-valued function r defined on a parametric
interval [a, b], prove that the scalar triple product r'(t) · r(a) × r(b) is zero for at least one t in
(a, b). Interpret this result geometrically.


15.1 Introduction

Throughout this book we have encountered many examples of mathematical objects 
that can be added to each other and multiplied by real numbers. First of all, the real 
numbers themselves are such objects. Other examples are real-valued functions, the complex 
numbers, infinite series, vectors in n-space, and vector-valued functions. In this chapter we 
discuss a general mathematical concept, called a linear space, which includes all these 
examples and many others as special cases. 

Briefly, a linear space is a set of elements of any kind on which certain operations (called 
addition and multiplication by numbers) can be performed. In defining a linear space, we 
do not specify the nature of the elements nor do we tell how the operations are to be per- 
formed on them. Instead, we require that the operations have certain properties which 
we take as axioms for a linear space. We turn now to a detailed description of these axioms. 


15.2 The definition of a linear space 

Let V denote a nonempty set of objects, called elements. The set V is called a linear 
space if it satisfies the following ten axioms which we list in three groups. 

Closure axioms 

AXIOM 1. CLOSURE UNDER ADDITION. For every pair of elements x and y in V there
corresponds a unique element in V called the sum of x and y, denoted by x + y.

AXIOM 2. CLOSURE UNDER MULTIPLICATION BY REAL NUMBERS. For every x in V and
every real number a there corresponds an element in V called the product of a and x, denoted
by ax.

Axioms for addition 

AXIOM 3. COMMUTATIVE LAW. For all x and y in V, we have x + y = y + x.

AXIOM 4. ASSOCIATIVE LAW. For all x, y, and z in V, we have (x + y) + z = x + (y + z).


AXIOM 5. EXISTENCE OF ZERO ELEMENT. There is an element in V, denoted by 0, such that

x + 0 = x for all x in V.

AXIOM 6. EXISTENCE OF NEGATIVES. For every x in V, the element (-1)x has the property

x + (-1)x = 0.

Axioms for multiplication by numbers 

AXIOM 7. ASSOCIATIVE LAW. For every x in V and all real numbers a and b, we have

a(bx) = (ab)x.

AXIOM 8. DISTRIBUTIVE LAW FOR ADDITION IN V. For all x and y in V and all real a,
we have

a(x + y) = ax + ay.

AXIOM 9. DISTRIBUTIVE LAW FOR ADDITION OF NUMBERS. For all x in V and all real
a and b, we have

(a + b)x = ax + bx.

AXIOM 10. EXISTENCE OF IDENTITY. For every x in V, we have 1x = x.


Linear spaces, as defined above, are sometimes called real linear spaces to emphasize
the fact that we are multiplying the elements of V by real numbers. If real number is
replaced by complex number in Axioms 2, 7, 8, and 9, the resulting structure is called a
complex linear space. Sometimes a linear space is referred to as a linear vector space or simply
a vector space; the numbers used as multipliers are also called scalars. A real linear space
has real numbers as scalars; a complex linear space has complex numbers as scalars.
Although we shall deal primarily with examples of real linear spaces, all the theorems are 
valid for complex linear spaces as well. When we use the term linear space without further 
designation, it is to be understood that the space can be real or complex. 


15.3 Examples of linear spaces 

If we specify the set V and tell how to add its elements and how to multiply them by 
numbers, we get a concrete example of a linear space. The reader can easily verify that each 
of the following examples satisfies all the axioms for a real linear space. 

example 1. Let V = R, the set of all real numbers, and let x + y and ax be ordinary
addition and multiplication of real numbers.

example 2. Let V = C, the set of all complex numbers, define x + y to be ordinary
addition of complex numbers, and define ax to be multiplication of the complex number x
by the real number a. Even though the elements of V are complex numbers, this is a real
linear space because the scalars are real.






example 3. Let V = V_n, the vector space of all n-tuples of real numbers, with addition
and multiplication by scalars defined in the usual way in terms of components.

example 4. Let V be the set of all vectors in V_n orthogonal to a given nonzero vector
N. If n = 2, this linear space is a line through 0 with N as a normal vector. If n = 3,
it is a plane through 0 with N as normal vector.

The following examples are called function spaces. The elements of V are real-valued
functions, with addition of two functions f and g defined in the usual way:

(f + g)(x) = f(x) + g(x)

for every real x in the intersection of the domains of f and g. Multiplication of a function
f by a real scalar a is defined as follows: af is that function whose value at each x in the
domain of f is af(x). The zero element is the function whose values are everywhere zero.
The reader can easily verify that each of the following sets is a function space.

example 5. The set of all functions defined on a given interval. 

example 6. The set of all polynomials. 

example 7. The set of all polynomials of degree ≤ n, where n is fixed. (Whenever we
consider this set it is understood that the zero polynomial is also included.) The set of
all polynomials of degree equal to n is not a linear space because the closure axioms are not
satisfied. For example, the sum of two polynomials of degree n need not have degree n.

example 8. The set of all functions continuous on a given interval. If the interval is
[a, b], we denote this space by C(a, b).

example 9. The set of all functions differentiable at a given point. 

example 10. The set of all functions integrable on a given interval. 

example 11. The set of all functions f defined at 1 with f(1) = 0. The number 0 is
essential in this example. If we replace 0 by a nonzero number c, we violate the closure
axioms.

example 12. The set of all solutions of a homogeneous linear differential equation
y'' + ay' + by = 0, where a and b are given constants. Here again 0 is essential. The set
of solutions of a nonhomogeneous differential equation does not satisfy the closure
axioms.
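The closure failures mentioned in Examples 7 and 11 can be checked on small cases. The following Python sketch is only an illustration: the coefficient-list encoding of polynomials and the helper names poly_add, degree, and sum_value_at_one are assumptions made here, not notation from the text.

```python
# Polynomials are stored as coefficient lists [c0, c1, ..., cn] (an assumed encoding).

def poly_add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def degree(p):
    """Degree of a coefficient list; returns None for the zero polynomial."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return None

# Example 7: two polynomials of degree exactly 2 whose sum has degree 1.
p = [1, 0, 3]        # 1 + 3t^2
q = [0, 5, -3]       # 5t - 3t^2
s = poly_add(p, q)   # 1 + 5t

# Example 11: take f(x) = c*x and g(x) = c*x^2, so that f(1) = g(1) = c.
def sum_value_at_one(c):
    """Value of (f + g)(1); it equals c only when c = 0."""
    f = lambda x: c * x
    g = lambda x: c * x * x
    return f(1) + g(1)
```

With c = 2 the sum takes the value 4 at the point 1, so the sum leaves the set; with c = 0 it stays in the set.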

These examples and many others illustrate how the linear space concept permeates 
algebra, geometry, and analysis. When a theorem is deduced from the axioms of a linear 
space, we obtain, in one stroke, a result valid for each concrete example. By unifying 
diverse examples in this way we gain a deeper insight into each. Sometimes special knowl- 
edge of one particular example helps to anticipate or interpret results valid for other 
examples and reveals relationships which might otherwise escape notice. 

15.4 Elementary consequences of the axioms 

The following theorems are easily deduced from the axioms for a linear space. 

theorem 15.1. uniqueness of the zero element. In any linear space there is one 
and only one zero element. 

Proof. Axiom 5 tells us that there is at least one zero element. Suppose there were two,
say 0_1 and 0_2. Taking x = 0_1 and 0 = 0_2 in Axiom 5, we obtain 0_1 + 0_2 = 0_1.
Similarly, taking x = 0_2 and 0 = 0_1, we find 0_2 + 0_1 = 0_2. But 0_1 + 0_2 = 0_2 + 0_1
because of the commutative law, so 0_1 = 0_2.

theorem 15.2. uniqueness of negative elements. In any linear space every element 
has exactly one negative. That is, for every x there is one and only one y such that x + y = 0. 

Proof. Axiom 6 tells us that each x has at least one negative, namely (-1)x. Suppose
x has two negatives, say y_1 and y_2. Then x + y_1 = 0 and x + y_2 = 0. Adding y_2 to both
members of the first equation and using Axioms 5, 4, and 3, we find that

y_2 + (x + y_1) = y_2 + 0 = y_2 ,

and

y_2 + (x + y_1) = (y_2 + x) + y_1 = 0 + y_1 = y_1 + 0 = y_1 .

Therefore y_1 = y_2, so x has exactly one negative, the element (-1)x.

Notation. The negative of x is denoted by -x. The difference y - x is defined to be
the sum y + (-x).

The next theorem describes a number of properties which govern elementary algebraic 
manipulations in a linear space. 

theorem 15.3. In a given linear space, let x and y denote arbitrary elements and let 
a and b denote arbitrary scalars. Then we have the following properties: 

(a) 0x = 0.

(b) a0 = 0.

(c) (-a)x = -(ax) = a(-x).

(d) If ax = 0, then either a = 0 or x = 0.

(e) If ax = ay and a ≠ 0, then x = y.

(f) If ax = bx and x ≠ 0, then a = b.

(g) -(x + y) = (-x) + (-y) = -x - y.

(h) x + x = 2x, x + x + x = 3x, and in general, \sum_{i=1}^{n} x = nx.


We shall prove (a), (b), and (c) and leave the proofs of the other properties as exercises.

Proof of (a). Let z = 0x. We wish to prove that z = 0. Adding z to itself and using
Axiom 9, we find that

z + z = 0x + 0x = (0 + 0)x = 0x = z .

Now add -z to both members to get z = 0.

Proof of (b). Let z = a0, add z to itself, and use Axiom 8.
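Written out, the computation suggested for (b) might run as follows. This is only a sketch of the intended steps: the second equality uses Axiom 8 and the third uses 0 + 0 = 0, which is Axiom 5 applied to the zero element itself.

```latex
z + z = a0 + a0 = a(0 + 0) = a0 = z .
```

Adding -z to both members then gives z = 0, exactly as in the proof of (a).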

Proof of (c). Let z = (-a)x. Adding z to ax and using Axiom 9, we find that

z + ax = (-a)x + ax = (-a + a)x = 0x = 0 ,

so z is the negative of ax, z = -(ax). Similarly, if we add a(-x) to ax and use Axiom 8 and
property (b), we find that a(-x) = -(ax).

15.5 Exercises 

In Exercises 1 through 28, determine whether each of the given sets is a real linear space, if 
addition and multiplication by real scalars are defined in the usual way. For those that are not, 
tell which axioms fail to hold. The functions in Exercises 1 through 17 are real-valued. In Exer- 
cises 3, 4, and 5, each function has domain containing 0 and 1. In Exercises 7 through 12, each 
domain contains all real numbers. 

1. All rational functions.
2. All rational functions f/g, with the degree of f ≤ the degree of g (including f = 0).
3. All f with f(0) = f(1).
4. All f with 2f(0) = f(1).
5. All f with f(1) = 1 + f(0).
6. All step functions defined on [0, 1].
7. All f with f(x) → 0 as x → +∞.
8. All even functions.
9. All odd functions.
10. All bounded functions.
11. All increasing functions.
12. All functions with period 2π.
13. All f integrable on [0, 1] with \int_0^1 f(x) dx = 0.
14. All f integrable on [0, 1] with \int_0^1 f(x) dx ≥ 0.
15. All f satisfying f(x) = f(1 - x) for all x.
16. All Taylor polynomials of degree ≤ n for a fixed n (including the zero polynomial).
17. All solutions of a linear second-order homogeneous differential equation y'' + P(x)y' +
Q(x)y = 0, where P and Q are given functions, continuous everywhere.
18. All bounded real sequences.
19. All convergent real sequences.
20. All convergent real series.
21. All absolutely convergent real series.
22. All vectors (x, y, z) in V_3 with z = 0.
23. All vectors (x, y, z) in V_3 with x = 0 or y = 0.
24. All vectors (x, y, z) in V_3 with y = 5x.
25. All vectors (x, y, z) in V_3 with 3x + 4y = 1, z = 0.
26. All vectors (x, y, z) in V_3 which are scalar multiples of (1, 2, 3).
27. All vectors (x, y, z) in V_3 whose components satisfy a system of three linear equations of the
form:

a_{11}x + a_{12}y + a_{13}z = 0, a_{21}x + a_{22}y + a_{23}z = 0, a_{31}x + a_{32}y + a_{33}z = 0.

28. All vectors in V_n that are linear combinations of two given vectors A and B.
29. Let V = R^+, the set of positive real numbers. Define the "sum" of two elements x and y in
V to be their product x · y (in the usual sense), and define "multiplication" of an element x
in V by a scalar c to be x^c. Prove that V is a real linear space with 1 as the zero element.

30. (a) Prove that Axiom 10 can be deduced from the other axioms.

(b) Prove that Axiom 10 cannot be deduced from the other axioms if Axiom 6 is replaced by
Axiom 6': For every x in V there is an element y in V such that x + y = 0.

31. Let S be the set of all ordered pairs (x_1, x_2) of real numbers. In each case determine whether
or not S is a linear space with the operations of addition and multiplication by scalars defined
as indicated. If the set is not a linear space, indicate which axioms are violated.

(a) (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2), a(x_1, x_2) = (ax_1, 0).

(b) (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, 0), a(x_1, x_2) = (ax_1, ax_2).

(c) (x_1, x_2) + (y_1, y_2) = (x_1, x_2 + y_2), a(x_1, x_2) = (ax_1, ax_2).

(d) (x_1, x_2) + (y_1, y_2) = (|x_1 + x_2|, |y_1 + y_2|), a(x_1, x_2) = (|ax_1|, |ax_2|).

32. Prove parts (d) through (h) of Theorem 15.3.


15.6 Subspaces of a linear space 

Given a linear space V, let S be a nonempty subset of V. If S is also a linear space, with 
the same operations of addition and multiplication by scalars, then S is called a subspace 
of V. The next theorem gives a simple criterion for determining whether or not a subset of 
a linear space is a subspace. 

theorem 15.4. Let S be a nonempty subset of a linear space V. Then S is a subspace 
if and only if S satisfies the closure axioms. 

Proof. If S is a subspace, it satisfies all the axioms for a linear space, and hence, in 
particular, it satisfies the closure axioms. 

Now we show that if S satisfies the closure axioms it satisfies the others as well. The
commutative and associative laws for addition (Axioms 3 and 4) and the axioms for
multiplication by scalars (Axioms 7 through 10) are automatically satisfied in S because
they hold for all elements of V. It remains to verify Axioms 5 and 6, the existence of a zero
element in S, and the existence of a negative for each element in S.

Let x be any element of S. (S has at least one element since S is not empty.) By Axiom
2, ax is in S for every scalar a. Taking a = 0, it follows that 0x is in S. But 0x = 0, by
Theorem 15.3(a), so 0 ∈ S, and Axiom 5 is satisfied. Taking a = -1, we see that (-1)x
is in S. But x + (-1)x = 0 since both x and (-1)x are in V, so Axiom 6 is satisfied in
S. Therefore S is a subspace of V.
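Theorem 15.4 reduces the subspace question to the two closure axioms, which suggests a quick experiment on concrete subsets of V_3. The Python sketch below is an illustration under stated assumptions: the helper closed_on_samples and the two planes chosen are hypothetical examples, and a finite sample can only refute closure, never prove it.

```python
from fractions import Fraction

def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(x, y))

def scale(a, x):
    """Multiplication of the vector x by the scalar a."""
    return tuple(a * c for c in x)

def closed_on_samples(member, samples, scalars):
    """Return True if every sampled sum and scalar multiple stays in the set."""
    for x in samples:
        for y in samples:
            if not member(add(x, y)):
                return False
        for a in scalars:
            if not member(scale(a, x)):
                return False
    return True

# Plane through 0 with normal N = (1, 2, 3): all v with x + 2y + 3z = 0.
in_plane = lambda v: v[0] + 2 * v[1] + 3 * v[2] == 0
plane_samples = [(-(2 * y + 3 * z), y, z) for y in range(-2, 3) for z in range(-2, 3)]

# Parallel plane not through 0: all v with x + 2y + 3z = 1.
in_shifted = lambda v: v[0] + 2 * v[1] + 3 * v[2] == 1
shifted_samples = [(1 - 2 * y - 3 * z, y, z) for y in range(-2, 3) for z in range(-2, 3)]

scalars = [Fraction(-3, 2), 0, 1, 2]
plane_ok = closed_on_samples(in_plane, plane_samples, scalars)
shifted_ok = closed_on_samples(in_shifted, shifted_samples, scalars)
```

The plane through 0 passes the sampled closure checks, while the shifted plane fails at the first sum, since adding two of its vectors gives x + 2y + 3z = 2.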

DEFINITION. Let S be a nonempty subset of a linear space V. An element x in V of the
form

x = \sum_{i=1}^{k} c_i x_i ,

where x_1, ..., x_k are all in S and c_1, ..., c_k are scalars, is called a finite linear combination
of elements of S. The set of all finite linear combinations of elements of S satisfies the
closure axioms and hence is a subspace of V. We call this the subspace spanned by S, or the
linear span of S, and denote it by L(S). If S is empty, we define L(S) to be {0}, the set
consisting of the zero element alone.

Different sets may span the same subspace. For example, the space V_2 is spanned by
each of the following sets of vectors: {i, j}, {i, j, i + j}, {0, i, -i, j, -j, i + j}. The space of
all polynomials p(t) of degree ≤ n is spanned by the set of n + 1 polynomials

{1, t, t^2, ..., t^n}.

It is also spanned by the set {1, t/2, t^2/3, ..., t^n/(n + 1)}, and by {1, (1 + t), (1 + t)^2, ...,
(1 + t)^n}. The space of all polynomials is spanned by the infinite set of polynomials

{1, t, t^2, ...}.
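One way to check that several finite sets span the same subspace of V_2, or of the polynomials of degree at most 2, is to compare ranks. The sketch below is an illustration only: the function rank is an assumed helper using exact rational arithmetic, and polynomials are encoded as coefficient vectors (c_0, c_1, c_2), which is a choice made here rather than in the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(c) for c in r] for r in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

i, j = (1, 0), (0, 1)
set_a = [i, j]
set_b = [i, j, (1, 1)]
set_c = [(0, 0), i, (-1, 0), j, (0, -1), (1, 1)]

# Polynomials of degree <= 2 as coefficient vectors (c0, c1, c2).
powers = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]    # 1, t, t^2
shifted = [(1, 0, 0), (1, 1, 0), (1, 2, 1)]   # 1, 1 + t, (1 + t)^2
```

Each of the three sets of plane vectors has rank 2, so each spans all of V_2; both polynomial sets have rank 3, so each spans the whole space of polynomials of degree at most 2.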

A number of questions arise naturally at this point. For example, which spaces can be 
spanned by a finite set of elements? If a space can be spanned by a finite set of elements, 
what is the smallest number of elements required? To discuss these and related questions, 
we introduce the concepts of dependence, independence, bases, and dimension. These
ideas were encountered in Chapter 12 in our study of the vector space V_n. Now we extend
them to general linear spaces. 


15.7 Dependent and independent sets in a linear space 

definition. A set S of elements in a linear space V is called dependent if there is a finite
set of distinct elements in S, say x_1, ..., x_k, and a corresponding set of scalars c_1, ..., c_k,
not all zero, such that

\sum_{i=1}^{k} c_i x_i = 0 .

The set S is called independent if it is not dependent. In this case, for all choices of distinct
elements x_1, ..., x_k in S and scalars c_1, ..., c_k,

\sum_{i=1}^{k} c_i x_i = 0 implies c_1 = c_2 = ... = c_k = 0 .

Although dependence and independence are properties of sets of elements, we also apply 
these terms to the elements themselves. For example, the elements in an independent set 
are called independent elements. 

If S is a finite set, the foregoing definition agrees with that given in Chapter 12 for the
space V_n. However, the present definition is not restricted to finite sets.

example 1. If a subset T of a set S is dependent, then S itself is dependent. This is
logically equivalent to the statement that every subset of an independent set is independent.

example 2. If one element in S is a scalar multiple of another, then S is dependent.

example 3. If 0 ∈ S, then S is dependent.

example 4. The empty set is independent. 
Many examples of dependent and independent sets of vectors in V_n were discussed in
Chapter 12. The following examples illustrate these concepts in function spaces. In each
case the underlying linear space V is the set of all real-valued functions defined on the real
line.

example 5. Let u_1(t) = cos^2 t, u_2(t) = sin^2 t, u_3(t) = 1 for all real t. The Pythagorean
identity shows that u_1 + u_2 - u_3 = 0, so the three functions u_1, u_2, u_3 are dependent.

example 6. Let u_k(t) = t^k for k = 0, 1, 2, ..., and t real. The set S = {u_0, u_1, u_2, ...} is
independent. To prove this, it suffices to show that for each n the n + 1 polynomials
u_0, u_1, ..., u_n are independent. A relation of the form \sum c_k u_k = 0 means that

(15.1) \sum_{k=0}^{n} c_k t^k = 0

for all real t. When t = 0, this gives c_0 = 0. Differentiating (15.1) and setting t = 0,
we find that c_1 = 0. Repeating the process, we find that each coefficient c_k is zero.

example 7. If a_1, ..., a_n are distinct real numbers, the n exponential functions

u_1(x) = e^{a_1 x}, ..., u_n(x) = e^{a_n x}

are independent. We can prove this by induction on n. The result holds trivially when
n = 1. Therefore, assume it is true for n - 1 exponential functions and consider scalars
c_1, ..., c_n such that

(15.2) \sum_{k=1}^{n} c_k e^{a_k x} = 0 .

Let a_M be the largest of the n numbers a_1, ..., a_n. Multiplying both members of (15.2)
by e^{-a_M x}, we obtain

(15.3) \sum_{k=1}^{n} c_k e^{(a_k - a_M)x} = 0 .

If k ≠ M, the number a_k - a_M is negative. Therefore, when x → +∞ in Equation (15.3),
each term with k ≠ M tends to zero and we find that c_M = 0. Deleting the Mth term from
(15.2) and applying the induction hypothesis, we find that each of the remaining n - 1
coefficients c_k is zero.
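Examples 5 and 6 can be probed numerically by sampling the functions at a few points and computing the rank of the resulting matrix. The following Python sketch is a heuristic illustration, not a proof technique from the text: the helpers rank and sample_matrix, the sample points, and the tolerance are all assumptions. Full column rank does prove independence, because a vanishing linear combination must vanish at the sample points; a rank deficiency only suggests dependence.

```python
import math

def rank(rows, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = max(range(r, len(m)), key=lambda i: abs(m[i][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def sample_matrix(funcs, points):
    """Row t holds the values of every function at the sample point t."""
    return [[f(t) for f in funcs] for t in points]

points = [0.3, 1.1, 2.0, 2.7]

# Example 5: cos^2 t, sin^2 t, 1 satisfy u_1 + u_2 - u_3 = 0.
trig = [lambda t: math.cos(t) ** 2, lambda t: math.sin(t) ** 2, lambda t: 1.0]

# Example 6 with n = 2: the powers 1, t, t^2.
powers = [lambda t: 1.0, lambda t: t, lambda t: t * t]

trig_rank = rank(sample_matrix(trig, points))
powers_rank = rank(sample_matrix(powers, points))
```

Here the three trigonometric functions give rank 2, consistent with the Pythagorean relation, and the three powers give rank 3, which confirms their independence.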

theorem 15.5. Let S be an independent set consisting of k elements in a linear space V
and let L(S) be the subspace spanned by S. Then every set of k + 1 elements in L(S) is
dependent.

Proof. When V = V_n, Theorem 15.5 reduces to Theorem 12.8. If we examine the proof
of Theorem 12.8, we find that it is based only on the fact that V_n is a linear space and not
on any other special property of V_n. Therefore the proof given for Theorem 12.8 is valid
for any linear space V.


15.8 Bases and dimension 

definition. A finite set S of elements in a linear space V is called a finite basis for V
if S is independent and spans V. The space V is called finite dimensional if it has a finite
basis, or if V consists of 0 alone. Otherwise V is called infinite dimensional.


THEOREM 15.6. Let V be a finite-dimensional linear space. Then every finite basis for V 
has the same number of elements. 

Proof. Let S and T be two finite bases for V. Suppose S consists of k elements and T
consists of m elements. Since S is independent and spans V, Theorem 15.5 tells us that
every set of k + 1 elements in V is dependent. Therefore, every set of more than k elements
in V is dependent. Since T is an independent set, we must have m ≤ k. The same
argument with S and T interchanged shows that k ≤ m. Therefore k = m.


definition. If a linear space V has a basis of n elements, the integer n is called the
dimension of V. We write n = dim V. If V = {0}, we say V has dimension 0.

example 1. The space V n has dimension n. One basis is the set of n unit coordinate 
vectors. 


example 2. The space of all polynomials p(t) of degree ≤ n has dimension n + 1. One
basis is the set of n + 1 polynomials {1, t, t^2, ..., t^n}. Every polynomial of degree ≤ n is a
linear combination of these n + 1 polynomials.

example 3. The space of solutions of the differential equation y'' - 2y' - 3y = 0 has
dimension 2. One basis consists of the two functions u_1(x) = e^{-x}, u_2(x) = e^{3x}. Every
solution is a linear combination of these two.

example 4. The space of all polynomials p(t) is infinite-dimensional. Although the
infinite set {1, t, t^2, ...} spans this space, no finite set of polynomials spans the space.
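The two basis functions in Example 3 can be recovered by a short computation, sketched here under the standard assumption that one tries solutions of the form y = e^{rx} with r a constant to be determined. Substituting into y'' - 2y' - 3y = 0 gives

```latex
(r^2 - 2r - 3)\, e^{rx} = 0
\quad\Longrightarrow\quad
r^2 - 2r - 3 = (r + 1)(r - 3) = 0
\quad\Longrightarrow\quad
r = -1 \ \text{or}\ r = 3 .
```

This yields e^{-x} and e^{3x}, which are independent by Example 7 of Section 15.7 because the exponents -1 and 3 are distinct. That these two functions span the whole solution space is a separate fact about the differential equation, which this computation does not establish.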

theorem 15.7. Let V be a finite-dimensional linear space with dim V = n. Then we
have the following:

(a) Any set of independent elements in V is a subset of some basis for V. 

(b) Any set of n independent elements is a basis for V. 

Proof. The proof of (a) is identical to that of part (b) of Theorem 12.10. The proof of 
(b) is identical to that of part (c) of Theorem 12.10. 


Let V be a linear space of dimension n and consider a basis whose elements e_1, ..., e_n
are taken in a given order. We denote such an ordered basis as an n-tuple (e_1, ..., e_n).
If x ∈ V, we can express x as a linear combination of these basis elements:

(15.4) x = \sum_{i=1}^{n} c_i e_i .

The coefficients in this equation determine an n-tuple of numbers (c_1, ..., c_n) that is
uniquely determined by x. In fact, if we have another representation of x as a linear
combination of e_1, ..., e_n, say x = \sum_{i=1}^{n} d_i e_i, then by subtraction from (15.4), we find that
\sum_{i=1}^{n} (c_i - d_i) e_i = 0. But since the basis elements are independent, this implies c_i = d_i
for each i, so we have (c_1, ..., c_n) = (d_1, ..., d_n).

The components of the ordered n-tuple (c_1, ..., c_n) determined by Equation (15.4) are
called the components of x relative to the ordered basis (e_1, ..., e_n).
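Finding the components in Equation (15.4) amounts to solving a linear system. As a hedged illustration, the Python sketch below computes the components of the polynomial t^2 relative to the ordered basis (1, 1 + t, (1 + t)^2) of the polynomials of degree at most 2. The function solve and the coefficient-vector encoding (c_0, c_1, c_2) are assumptions made for this example.

```python
from fractions import Fraction

def solve(columns, target):
    """Solve sum_i c_i * columns[i] = target exactly (columns must form a basis)."""
    n = len(columns)
    # Augmented matrix: column j is the j-th basis vector, last column is the target.
    a = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if a[i][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for i in range(n):
            if i != col and a[i][col] != 0:
                f = a[i][col]
                a[i] = [u - f * v for u, v in zip(a[i], a[col])]
    return [a[i][n] for i in range(n)]

# Ordered basis (1, 1 + t, (1 + t)^2) as coefficient vectors (c0, c1, c2).
basis = [(1, 0, 0), (1, 1, 0), (1, 2, 1)]
x = (0, 0, 1)                      # the polynomial t^2
components = solve(basis, x)
```

The result expresses t^2 as 1 - 2(1 + t) + (1 + t)^2, so the components relative to this ordered basis are (1, -2, 1). Reordering the basis would permute the components accordingly.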


15.9 Exercises 

In each of Exercises 1 through 10, let S denote the set of all vectors (x, y, z) in V_3 whose
components satisfy the condition given. Determine whether S is a subspace of V_3. If S is a
subspace, compute dim S.

1. x = 0.
2. x + y = 0.
3. x + y + z = 0.
4. x = y.
5. x = y = z.
6. x = y or x = z.
7. x^2 - y^2 = 0.
8. x + y = 1.
9. y = 2x and z = 3x.
10. x + y + z = 0 and x - y - z = 0.

Let P_n denote the linear space of all real polynomials of degree ≤ n, where n is fixed. In each
of Exercises 11 through 20, let S denote the set of all polynomials f in P_n satisfying the condition
given. Determine whether or not S is a subspace of P_n. If S is a subspace, compute dim S.

11. f(0) = 0.
12. f'(0) = 0.
13. f''(0) = 0.
14. f(0) + f'(0) = 0.
15. f(0) = f(1).
16. f(0) = f(2).
17. f is even.
18. f is odd.
19. f has degree ≤ k, where k < n, or f = 0.
20. f has degree k, where k < n, or f = 0.

21. In the linear space of all real polynomials p(t), describe the subspace spanned by each of the
following subsets of polynomials and determine the dimension of this subspace.

(a) {1, t^2, t^4}; (b) ...; (c) ...; (d) {1 + t, (1 + t)^2}.

22. In this exercise, L(S) denotes the subspace spanned by a subset S of a linear space V. Prove
each of the statements (a) through (f).

(a) S ⊆ L(S).

(b) If S ⊆ T ⊆ V and if T is a subspace of V, then L(S) ⊆ T. This property is described
by saying that L(S) is the smallest subspace of V which contains S.

(c) A subset S of V is a subspace of V if and only if L(S) = S.

(d) If S ⊆ T ⊆ V, then L(S) ⊆ L(T).

(e) If S and T are subspaces of V, then so is S ∩ T.

(f) If S and T are subsets of V, then L(S ∩ T) ⊆ L(S) ∩ L(T).

(g) Give an example in which L(S ∩ T) ≠ L(S) ∩ L(T).

23. Let V be the linear space consisting of all real-valued functions defined on the real line.
Determine whether each of the following subsets of V is dependent or independent. Compute
the dimension of the subspace spanned by each set.

(a) {1, e^{ax}, e^{bx}}, a ≠ b.
(b) {e^{ax}, xe^{ax}}.
(c) {1, e^{ax}, xe^{ax}}.
(d) {e^{ax}, xe^{ax}, x^2 e^{ax}}.
(e) {e^x, e^{-x}, cosh x}.
(f) {cos x, sin x}.
(g) {cos^2 x, sin^2 x}.
(h) {1, cos 2x, sin^2 x}.
(i) {sin x, sin 2x}.
(j) {e^x cos x, e^{-x} sin x}.

24. Let V be a finite-dimensional linear space, and let S be a subspace of V. Prove each of the
following statements.

(a) S is finite dimensional and dim S ≤ dim V.

(b) dim S = dim V if and only if S = V.

(c) Every basis for S is part of a basis for V.

(d) A basis for V need not contain a basis for S.


15.10 Inner products, Euclidean spaces. Norms 

In ordinary Euclidean geometry, those properties that rely on the possibility of measuring 
lengths of line segments and angles between lines are called metric properties. In our study 
of V n , we defined lengths and angles in terms of the dot product. Now we wish to extend 
these ideas to more general linear spaces. We shall introduce first a generalization of the 
dot product, which we call an inner product, and then define length and angle in terms of the
inner product.

The dot product x · y of two vectors x = (x_1, ..., x_n) and y = (y_1, ..., y_n) in V_n was
defined in Chapter 12 by the formula

(15.5) x · y = \sum_{i=1}^{n} x_i y_i .

In a general linear space, we write (x, y) instead of x • y for inner products, and we define 
the product axiomatically rather than by a specific formula. That is, we state a number of 
properties we wish inner products to satisfy and we regard these properties as axioms. 


DEFINITION. A real linear space V is said to have an inner product if for each pair of
elements x and y in V there corresponds a unique real number (x, y) satisfying the following
axioms for all choices of x, y, z in V and all real scalars c.

(1) (x, y) = (y, x) (commutativity, or symmetry).

(2) (x, y + z) = (x, y) + (x, z) (distributivity, or linearity).

(3) c(x, y) = (cx, y) (associativity, or homogeneity).

(4) (x, x) > 0 if x ≠ 0 (positivity).


A real linear space with an inner product is called a real Euclidean space. 

Note: Taking c = 0 in (3), we find that (0, y) = 0 for all y. 

In a complex linear space, an inner product (x, y) is a complex number satisfying the
same axioms as those for a real inner product, except that the symmetry axiom is replaced
by the relation

(1') (x, y) = \overline{(y, x)} ,

where \overline{(y, x)} denotes the complex conjugate of (y, x). In the homogeneity axiom, the scalar
multiplier c can be any complex number. From the homogeneity axiom and (1'), we get
the companion relation

(x, cy) = \overline{(cy, x)} = \bar{c} \overline{(y, x)} = \bar{c} (x, y) .


A complex linear space with an inner product is called a complex Euclidean space. 
(Sometimes the term unitary space is also used.) One example is the complex vector space
V_n(C) discussed briefly in Section 12.16.

Although we are interested primarily in examples of real Euclidean spaces, the theorems 
of this chapter are valid for complex Euclidean spaces as well. When we use the term 
Euclidean space without further designation, it is to be understood that the space can be 
real or complex. 

The reader should verify that each of the following satisfies all the axioms for an inner 
product. 


example 1. In V_n let (x, y) = x · y, the usual dot product of x and y.

example 2. If x = (x_1, x_2) and y = (y_1, y_2) are any two vectors in V_2, define (x, y) by
the formula

(x, y) = 2x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2 .


This example shows that there may be more than one inner product in a given linear space. 
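The positivity axiom for Example 2 is the only one that takes a moment to check. Assuming the formula reads (x, y) = 2x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2, a possible verification completes the square:

```latex
(x, x) = 2x_1^2 + 2x_1 x_2 + x_2^2 = x_1^2 + (x_1 + x_2)^2 .
```

The right-hand side is a sum of two squares, so it vanishes only when x_1 = 0 and x_1 + x_2 = 0, that is, only when x = 0. Symmetry, linearity, and homogeneity follow directly from the form of the expression.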

example 3. Let C(a, b) denote the linear space of all real-valued functions continuous
on an interval [a, b]. Define an inner product of two functions f and g by the formula

(f, g) = \int_a^b f(t)g(t) dt .

This formula is analogous to Equation (15.5) which defines the dot product of two vectors
in V_n. The function values f(t) and g(t) play the role of the components x_i and y_i, and
integration takes the place of summation.

example 4. In the space C(a, b), define

(f, g) = \int_a^b w(t)f(t)g(t) dt ,

where w is a fixed positive function in C(a, b). The function w is called a weight function.
In Example 3 we have w(t) = 1 for all t.

example 5. In the linear space of all real polynomials, define

(f, g) = \int_0^∞ e^{-t} f(t)g(t) dt .

Because of the exponential factor, this improper integral converges for every choice of 
polynomials f and g. 

theorem 15.8. In a Euclidean space V, every inner product satisfies the Cauchy-Schwarz
inequality:

|(x, y)|^2 ≤ (x, x)(y, y) for all x and y in V.

Moreover, the equality sign holds if and only if x and y are dependent.

Proof. When we proved the corresponding result for vectors in V_n (Theorem 12.3),
we were careful to point out that the proof was a consequence of the properties of the dot
product listed in Theorem 12.2 and did not depend on the particular definition used to
deduce these properties. Therefore, the very same proof is valid in any real Euclidean
space. When we apply this proof in a complex Euclidean space, we obtain the inequality
(x, y)(y, x) ≤ (x, x)(y, y), which is the same as the Cauchy-Schwarz inequality since

(x, y)(y, x) = (x, y) \overline{(x, y)} = |(x, y)|^2 .

example. Applying Theorem 15.8 to the space C(a, b) with the inner product (f, g) =
\int_a^b f(t)g(t) dt, we find that the Cauchy-Schwarz inequality becomes

( \int_a^b f(t)g(t) dt )^2 ≤ ( \int_a^b f^2(t) dt ) ( \int_a^b g^2(t) dt ) .
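The integral form of the Cauchy-Schwarz inequality can be checked numerically for particular functions. The sketch below is only an illustration: the choice f(t) = t and g(t) = e^t on [0, 1], the midpoint rule, and the helper names integrate and inner are assumptions, and the integrals are approximations rather than exact values.

```python
import math

def integrate(h, a, b, n=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (k + 0.5) * step) for k in range(n))

def inner(f, g, a, b):
    """Approximate inner product (f, g) = integral of f(t) g(t) over [a, b]."""
    return integrate(lambda t: f(t) * g(t), a, b)

f = lambda t: t
g = math.exp

lhs = inner(f, g, 0.0, 1.0) ** 2                      # (integral of t e^t)^2
rhs = inner(f, f, 0.0, 1.0) * inner(g, g, 0.0, 1.0)   # (integral of t^2)(integral of e^{2t})
```

For this pair the exact values are 1 on the left and (1/3)(e^2 - 1)/2 on the right, so the inequality holds strictly, as expected for functions that are not dependent.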

The inner product can be used to introduce the metric concept of length in any Euclidean 
space. 


definition. In a Euclidean space V, the nonnegative number ||x|| defined by the equation

||x|| = (x, x)^{1/2}

is called the norm of the element x.

When the Cauchy-Schwarz inequality is expressed in terms of norms, it becomes

|(x, y)| ≤ ||x|| ||y|| .

Since it may be possible to define an inner product in many different ways, the norm 
of an element will depend on the choice of inner product. This lack of uniqueness is to be 
expected. It is analogous to the fact that we can assign different numbers to measure the 
length of a given line segment, depending on the choice of scale or unit of measurement. 
The next theorem gives fundamental properties of norms that do not depend on the choice 
of inner product. 

THEOREM 15.9. In a Euclidean space, every norm has the following properties for all
elements x and y and all scalars c:

(a) ||x|| = 0 if x = 0.

(b) ||x|| > 0 if x ≠ 0 (positivity).

(c) ||cx|| = |c| ||x|| (homogeneity).

(d) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality).

The equality sign holds in (d) if x = 0, if y = 0, or if y = cx for some c > 0.

Proof. Properties (a), (b) and (c) follow at once from the axioms for an inner product.
To prove (d), we note that

||x + y||^2 = (x + y, x + y) = (x, x) + (y, y) + (x, y) + (y, x)

= ||x||^2 + ||y||^2 + (x, y) + \overline{(x, y)} .

The sum (x, y) + \overline{(x, y)} is real. The Cauchy-Schwarz inequality shows that |(x, y)| ≤
||x|| ||y|| and |\overline{(x, y)}| ≤ ||x|| ||y||, so we have

||x + y||^2 ≤ ||x||^2 + ||y||^2 + 2 ||x|| ||y|| = (||x|| + ||y||)^2 .

This proves (d). When y = cx, where c > 0, we have

||x + y|| = ||x + cx|| = (1 + c) ||x|| = ||x|| + ||cx|| = ||x|| + ||y|| .

definition. In a real Euclidean space V, the angle between two nonzero elements x and
y is defined to be that number θ in the interval 0 ≤ θ ≤ π which satisfies the equation

(15.6) cos θ = (x, y) / (||x|| ||y||) .

Note: The Cauchy-Schwarz inequality shows that the quotient on the right of (15.6)
lies in the interval [-1, 1], so there is exactly one θ in [0, π] whose cosine is equal to this
quotient.
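As a concrete instance of Equation (15.6), one can compute the angle between the functions f(t) = 1 and g(t) = t in the space C(0, 1) with the inner product given by the integral of f(t)g(t) over [0, 1]. The Python sketch below is an illustration under assumptions: polynomials are encoded as integer coefficient lists, and the helpers inner and angle are names introduced here. It uses the fact that the integral of t^{i+j} over [0, 1] is 1/(i + j + 1).

```python
import math
from fractions import Fraction

def inner(p, q):
    """Exact integral over [0, 1] of p(t) q(t) for integer coefficient lists p and q."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def angle(p, q):
    """Angle defined by (15.6) between two nonzero polynomials."""
    cos_theta = float(inner(p, q)) / math.sqrt(float(inner(p, p)) * float(inner(q, q)))
    # Clamp against rounding so that acos always receives a value in [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))

theta = angle([1], [0, 1])   # f(t) = 1 and g(t) = t
```

Here (f, g) = 1/2, ||f|| = 1, and ||g|| = 1/\sqrt{3}, so cos θ = \sqrt{3}/2 and the angle is π/6. A different inner product on the same space would generally give a different angle.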


15.11 Orthogonality in a Euclidean space 

definition. In a Euclidean space V, two elements x and y are called orthogonal if their
inner product is zero. A subset S of V is called an orthogonal set if (x, y) = 0 for every pair
of distinct elements x and y in S. An orthogonal set is called orthonormal if each of its
elements has norm 1.

The zero element is orthogonal to every element of V; it is the only element orthogonal to
itself. The next theorem shows a relation between orthogonality and dependence.

theorem 15.10. In a Euclidean space V, every orthogonal set of nonzero elements is
independent. In particular, in a finite-dimensional Euclidean space with dim V = n, every
orthogonal set consisting of n nonzero elements is a basis for V.

Proof. Let S be an orthogonal set of nonzero elements in V, and suppose some finite
linear combination of elements of S is zero, say

\sum_{i=1}^{k} c_i x_i = 0 ,

where each x_i ∈ S. Taking the inner product of each member with x_1 and using the fact
that (x_1, x_i) = 0 if i ≠ 1, we find that c_1(x_1, x_1) = 0. But (x_1, x_1) ≠ 0 since x_1 ≠ 0, so
c_1 = 0. Repeating the argument with x_1 replaced by x_j, we find that each c_j = 0. This
proves that S is independent. If dim V = n and if S consists of n elements, Theorem 15.7(b)
shows that S is a basis for V.

example. In the real linear space C(0, 2π) with the inner product (f, g) = \int_0^{2π} f(x)g(x) dx,
let S be the set of trigonometric functions {u_0, u_1, u_2, ...} given by

u_0(x) = 1, u_{2n-1}(x) = cos nx, u_{2n}(x) = sin nx, for n = 1, 2, ... .

If m ≠ n, we have the orthogonality relations

\int_0^{2π} u_n(x) u_m(x) dx = 0 ,

so S is an orthogonal set. Since no member of S is the zero element, S is independent. The
norm of each element of S is easily calculated. We have (u_0, u_0) = \int_0^{2π} dx = 2π and, for
n ≥ 1, we have

(u_{2n-1}, u_{2n-1}) = \int_0^{2π} cos^2 nx dx = π, (u_{2n}, u_{2n}) = \int_0^{2π} sin^2 nx dx = π .

Therefore, ||u_0|| = \sqrt{2π} and ||u_n|| = \sqrt{π} for n ≥ 1. Dividing each u_n by its norm, we
obtain an orthonormal set {φ_0, φ_1, φ_2, ...} where φ_n = u_n / ||u_n||. Thus, we have

φ_0(x) = 1/\sqrt{2π}, φ_{2n-1}(x) = (cos nx)/\sqrt{π}, φ_{2n}(x) = (sin nx)/\sqrt{π}, for n ≥ 1.
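The orthogonality relations for the trigonometric system can be spot-checked by numerical integration. The following Python sketch is an illustration, not part of the text: the indexing function u, the midpoint rule, and the number of subintervals are assumptions. For trigonometric polynomials of low frequency the midpoint rule over a full period is accurate to rounding error, which is why tight tolerances are reasonable here.

```python
import math

def u(n):
    """The n-th function of the set S: 1, cos x, sin x, cos 2x, sin 2x, ..."""
    if n == 0:
        return lambda x: 1.0
    k = (n + 1) // 2                 # frequency: u_{2k-1} = cos kx, u_{2k} = sin kx
    if n % 2 == 1:
        return lambda x: math.cos(k * x)
    return lambda x: math.sin(k * x)

def inner(f, g, n=4000):
    """Midpoint-rule approximation of the integral of f(x) g(x) over [0, 2*pi]."""
    step = 2 * math.pi / n
    return step * sum(f((j + 0.5) * step) * g((j + 0.5) * step) for j in range(n))

# Matrix of inner products of the first five functions of S.
gram = [[inner(u(m), u(n)) for n in range(5)] for m in range(5)]
```

The off-diagonal entries of gram come out essentially zero, the first diagonal entry is close to 2π, and the remaining diagonal entries are close to π, matching the norms computed above.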


In Section 15.13 we shall prove that every finite-dimensional Euclidean space has an 
orthogonal basis. The next theorem shows how to compute the components of an element 
relative to such a basis. 


theorem 15.11. Let V be a finite-dimensional Euclidean space with dimension n, and
assume that S = {e_1, ..., e_n} is an orthogonal basis for V. If an element x is expressed as
a linear combination of the basis elements, say

(15.7) x = \sum_{i=1}^{n} c_i e_i ,

then its components relative to the ordered basis (e_1, ..., e_n) are given by the formulas

(15.8) c_j = (x, e_j) / (e_j, e_j) for j = 1, 2, ..., n.

In particular, if S is an orthonormal basis, each c_j is given by

(15.9) c_j = (x, e_j) .

Proof. Taking the inner product of each member of (15.7) with e_j, we obtain

(x, e_j) = \sum_{i=1}^{n} c_i (e_i, e_j) = c_j (e_j, e_j)

since (e_i, e_j) = 0 if i ≠ j. This implies (15.8), and when (e_j, e_j) = 1, we obtain (15.9).

If {e_1, ..., e_n} is an orthonormal basis, Equation (15.7) can be written in the form

(15.10) x = \sum_{i=1}^{n} (x, e_i) e_i .

The next theorem shows that in a finite-dimensional Euclidean space with an orthonormal
basis the inner product of two elements can be computed in terms of their components.


theorem 15.12. Let $V$ be a finite-dimensional Euclidean space of dimension $n$, and
assume that $\{e_1, \ldots, e_n\}$ is an orthonormal basis for $V$. Then for every pair of elements
$x$ and $y$ in $V$, we have

(15.11)    $(x, y) = \sum_{i=1}^{n} (x, e_i)\,\overline{(y, e_i)}$    (Parseval's formula).

In particular, when $x = y$, we have

(15.12)    $\|x\|^2 = \sum_{i=1}^{n} |(x, e_i)|^2$.

Proof. Taking the inner product of both members of Equation (15.10) with $y$ and using
the linearity property of the inner product, we obtain (15.11). When $x = y$, Equation
(15.11) reduces to (15.12).

Note: Equation (15.11) is named in honor of M. A. Parseval (circa 1776-1836), who 

obtained this type of formula in a special function space. 
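The component formula (15.8) and Parseval's formula can be tried on a small case. The sketch below is an illustration under stated assumptions: the space is $V_3$ with the usual dot product, the basis and the vectors `x`, `y` are hypothetical choices, and exact rational arithmetic is used. Because the basis is orthogonal but not orthonormal, each term of the Parseval sum carries the weight $(e_j, e_j)$; for an orthonormal basis these weights equal 1 and the sum reduces to (15.11) in the real case.

```python
from fractions import Fraction

def dot(x, y):
    """Usual dot product, used here as the inner product on V_n."""
    return sum(a * b for a, b in zip(x, y))

def components(x, basis):
    """Components c_j = (x, e_j) / (e_j, e_j) relative to an orthogonal basis, as in (15.8)."""
    return [Fraction(dot(x, e), dot(e, e)) for e in basis]

# Hypothetical orthogonal (not orthonormal) basis of V_3, chosen for illustration.
basis = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
x = (3, 1, 4)
y = (2, 0, -1)

c = components(x, basis)   # components of x
d = components(y, basis)   # components of y

# Rebuild x from its components, as in (15.7).
rebuilt = tuple(sum(cj * e[i] for cj, e in zip(c, basis)) for i in range(3))

# Parseval-type sum for an orthogonal basis: each term is weighted by (e_j, e_j).
parseval = sum(cj * dj * dot(e, e) for cj, dj, e in zip(c, d, basis))
```

Here `rebuilt` reproduces `x`, and `parseval` agrees with `dot(x, y)`.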


15.12 Exercises 


1. Let $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ be arbitrary vectors in $V_n$. In each case, determine
whether $(x, y)$ is an inner product for $V_n$ if $(x, y)$ is defined by the formula given. In case
$(x, y)$ is not an inner product, tell which axioms are not satisfied.


(a) (x, y) W' 

i = l 


(b) O, y) 


I*#* 


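(c) $(x, y) = \sum_{i=1}^{n} x_i \sum_{j=1}^{n} y_j$.

(d) $(x, y) = \left( \sum_{i=1}^{n} x_i^2 y_i^2 \right)^{1/2}$.

(e) $(x, y) = \sum_{i=1}^{n} (x_i + y_i)^2 - \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} y_i^2$.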


2. Suppose we retain the first three axioms for a real inner product (symmetry, linearity, and
homogeneity) but replace the fourth axiom by a new axiom (4'): $(x, x) = 0$ if and only if
$x = 0$. Prove that either $(x, x) > 0$ for all $x \ne 0$ or else $(x, x) < 0$ for all $x \ne 0$.

[Hint: Assume $(x, x) > 0$ for some $x \ne 0$ and $(y, y) < 0$ for some $y \ne 0$. In the
space spanned by $\{x, y\}$, find an element $z \ne 0$ with $(z, z) = 0$.]

Prove that each of the statements in Exercises 3 through 7 is valid for all elements x and y in a 
real Euclidean space. 

3. $(x, y) = 0$ if and only if $\|x + y\| = \|x - y\|$.

4. $(x, y) = 0$ if and only if $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

5. $(x, y) = 0$ if and only if $\|x + cy\| \ge \|x\|$ for all real $c$.

6. $(x + y, x - y) = 0$ if and only if $\|x\| = \|y\|$.

7. If $x$ and $y$ are nonzero elements making an angle $\theta$ with each other, then

$\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\|x\|\,\|y\| \cos\theta$.

8. In the real linear space $C(1, e)$, define an inner product by the equation

$(f, g) = \int_1^e (\log x)\, f(x) g(x)\,dx$.

(a) If $f(x) = \sqrt{x}$, compute $\|f\|$.

(b) Find a linear polynomial $g(x) = a + bx$ that is orthogonal to the constant function
$f(x) = 1$.

9. In the real linear space $C(-1, 1)$, let $(f, g) = \int_{-1}^{1} f(t) g(t)\,dt$. Consider the three functions
$u_1, u_2, u_3$ given by

$u_1(t) = 1, \quad u_2(t) = t, \quad u_3(t) = 1 + t$.

Prove that two of them are orthogonal, two make an angle $\pi/3$ with each other, and two
make an angle $\pi/6$ with each other.

10. In the linear space $P_n$ of all real polynomials of degree $\le n$, define


(/.*) = 



(a) Prove that $(f, g)$ is an inner product for $P_n$.

(b) Compute $(f, g)$ when $f(t) = t$ and $g(t) = at + b$.

(c) If $f(t) = t$, find all linear polynomials $g$ orthogonal to $f$.

11. In the linear space of all real polynomials, define $(f, g) = \int_0^{\infty} e^{-t} f(t) g(t)\,dt$.

(a) Prove that this improper integral converges absolutely for all polynomials $f$ and $g$.

(b) If $x_n(t) = t^n$ for $n = 0, 1, 2, \ldots$, prove that $(x_n, x_m) = (m + n)!$.

(c) Compute $(f, g)$ when $f(t) = (t + 1)^2$ and $g(t) = t^2 + 1$.

(d) Find all linear polynomials $g(t) = a + bt$ orthogonal to $f(t) = 1 + t$.

12. In the linear space of all real polynomials, determine whether or not $(f, g)$ is an inner product
if $(f, g)$ is defined by the formula given. In case $(f, g)$ is not an inner product, indicate which
axioms are violated. In (c), $f'$ and $g'$ denote derivatives.

(a) $(f, g) = f(1) g(1)$.

(b) $(f, g) = \left| \int_0^1 f(t) g(t)\,dt \right|$.

(c) $(f, g) = \int_0^1 f'(t) g'(t)\,dt$.

(d) $(f, g) = \left( \int_0^1 f(t)\,dt \right) \left( \int_0^1 g(t)\,dt \right)$.

13. Let $V$ consist of all infinite sequences $\{x_n\}$ of real numbers for which the series $\sum x_n^2$ converges.
If $x = \{x_n\}$ and $y = \{y_n\}$ are two elements of $V$, define

$(x, y) = \sum_{n=1}^{\infty} x_n y_n$.

(a) Prove that this series converges absolutely.

[Hint: Use the Cauchy-Schwarz inequality to estimate the sum $\sum_{n=1}^{M} |x_n y_n|$.]

(b) Prove that $V$ is a linear space with $(x, y)$ as an inner product.

(c) Compute $(x, y)$ if $x_n = 1/n$ and $y_n = 1/(n + 1)$ for $n \ge 1$.

(d) Compute $(x, y)$ if $x_n = 2^n$ and $y_n = 1/n!$ for $n \ge 1$.

14. Let $V$ be the set of all real functions $f$ continuous on $[0, +\infty)$ and such that the integral
$\int_0^{\infty} e^{-t} f^2(t)\,dt$ converges. Define $(f, g) = \int_0^{\infty} e^{-t} f(t) g(t)\,dt$.

(a) Prove that the integral for $(f, g)$ converges absolutely for each pair of functions $f$ and $g$
in $V$.

[Hint: Use the Cauchy-Schwarz inequality to estimate the integral $\int_0^{M} e^{-t} |f(t) g(t)|\,dt$.]

(b) Prove that $V$ is a linear space with $(f, g)$ as an inner product.

(c) Compute (f,g) if f(t) = and g(t) = t n , where n = 0, 1, 2 , . . . . 

15. In a complex Euclidean space, prove that the inner product has the following properties for
all elements $x$, $y$ and $z$, and all complex $a$ and $b$.

(a) $(ax, by) = a\bar{b}\,(x, y)$.    (b) $(x, ay + bz) = \bar{a}\,(x, y) + \bar{b}\,(x, z)$.

16. Prove that the following identities are valid in every Euclidean space.

(a) $\|x + y\|^2 = \|x\|^2 + \|y\|^2 + (x, y) + (y, x)$.

(b) $\|x + y\|^2 - \|x - y\|^2 = 2(x, y) + 2(y, x)$.

(c) $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$.

17. Prove that the space of all complex-valued functions continuous on an interval $[a, b]$ becomes
a unitary space if we define an inner product by the formula

$(f, g) = \int_a^b w(t) f(t) \overline{g(t)}\,dt$,

where $w$ is a fixed positive function, continuous on $[a, b]$.


15.13 Construction of orthogonal sets. The Gram-Schmidt process 

Every finite-dimensional linear space has a finite basis. If the space is Euclidean, we can
always construct an orthogonal basis. This result will be deduced as a consequence of a
general theorem whose proof shows how to construct orthogonal sets in any Euclidean
space, finite or infinite dimensional. The construction is called the Gram-Schmidt
orthogonalization process, in honor of J. P. Gram (1850-1916) and E. Schmidt (1876-1959).

theorem 15.13. orthogonalization theorem. Let $x_1, x_2, \ldots$ be a finite or infinite
sequence of elements in a Euclidean space $V$, and let $L(x_1, \ldots, x_k)$ denote the subspace
spanned by the first $k$ of these elements. Then there is a corresponding sequence of elements
$y_1, y_2, \ldots$ in $V$ which has the following properties for each integer $k$:

(a) The element $y_k$ is orthogonal to every element in the subspace $L(y_1, \ldots, y_{k-1})$.

(b) The subspace spanned by $y_1, \ldots, y_k$ is the same as that spanned by $x_1, \ldots, x_k$:

$L(y_1, \ldots, y_k) = L(x_1, \ldots, x_k)$.

(c) The sequence $y_1, y_2, \ldots$ is unique, except for scalar factors. That is, if $y_1', y_2', \ldots$ is
another sequence of elements in $V$ satisfying properties (a) and (b), then for each $k$ there is a
scalar $c_k$ such that $y_k' = c_k y_k$.

Proof. We construct the elements $y_1, y_2, \ldots$ by induction. To start the process, we
take $y_1 = x_1$. Now assume we have constructed $y_1, \ldots, y_r$ so that (a) and (b) are satisfied
when $k = r$. Then we define $y_{r+1}$ by the equation

(15.13)    $y_{r+1} = x_{r+1} - \sum_{i=1}^{r} a_i y_i$,

where the scalars $a_1, \ldots, a_r$ are to be determined. For $j \le r$, the inner product of $y_{r+1}$
with $y_j$ is given by

$(y_{r+1}, y_j) = (x_{r+1}, y_j) - \sum_{i=1}^{r} a_i (y_i, y_j) = (x_{r+1}, y_j) - a_j (y_j, y_j)$,

since $(y_i, y_j) = 0$ if $i \ne j$. If $y_j \ne 0$, we can make $y_{r+1}$ orthogonal to $y_j$ by taking

(15.14)    $a_j = \frac{(x_{r+1}, y_j)}{(y_j, y_j)}$.

If $y_j = 0$, then $y_{r+1}$ is orthogonal to $y_j$ for any choice of $a_j$, and in this case we choose
$a_j = 0$. Thus, the element $y_{r+1}$ is well defined and is orthogonal to each of the earlier
elements $y_1, \ldots, y_r$. Therefore, it is orthogonal to every element in the subspace

$L(y_1, \ldots, y_r)$.

This proves (a) when $k = r + 1$.

To prove (b) when $k = r + 1$, we must show that $L(y_1, \ldots, y_{r+1}) = L(x_1, \ldots, x_{r+1})$,
given that $L(y_1, \ldots, y_r) = L(x_1, \ldots, x_r)$. The first $r$ elements $y_1, \ldots, y_r$ are in

$L(x_1, \ldots, x_r)$

and hence they are in the larger subspace $L(x_1, \ldots, x_{r+1})$. The new element $y_{r+1}$ given by
(15.13) is a difference of two elements in $L(x_1, \ldots, x_{r+1})$ so it, too, is in $L(x_1, \ldots, x_{r+1})$.
This proves that

$L(y_1, \ldots, y_{r+1}) \subseteq L(x_1, \ldots, x_{r+1})$.

Equation (15.13) shows that $x_{r+1}$ is the sum of two elements in $L(y_1, \ldots, y_{r+1})$ so a similar
argument gives the inclusion in the other direction:

$L(x_1, \ldots, x_{r+1}) \subseteq L(y_1, \ldots, y_{r+1})$.

This proves (b) when $k = r + 1$. Therefore both (a) and (b) are proved by induction on $k$.

Finally we prove (c) by induction on $k$. The case $k = 1$ is trivial. Therefore, assume (c)
is true for $k = r$ and consider the element $y_{r+1}'$. Because of (b), this element is in

$L(y_1, \ldots, y_{r+1})$,

so we can write

$y_{r+1}' = \sum_{i=1}^{r+1} c_i y_i = z_r + c_{r+1} y_{r+1}$,

where $z_r \in L(y_1, \ldots, y_r)$. We wish to prove that $z_r = 0$. By property (a), both $y_{r+1}'$ and
$c_{r+1} y_{r+1}$ are orthogonal to $z_r$. Therefore, their difference, $z_r$, is orthogonal to $z_r$. In other
words, $z_r$ is orthogonal to itself, so $z_r = 0$. This completes the proof of the orthogonalization
theorem.


In the foregoing construction, suppose we have $y_{r+1} = 0$ for some $r$. Then (15.13)
shows that $x_{r+1}$ is a linear combination of $y_1, \ldots, y_r$, and hence of $x_1, \ldots, x_r$, so the
elements $x_1, \ldots, x_{r+1}$ are dependent. In other words, if the first $k$ elements $x_1, \ldots, x_k$
are independent, then the corresponding elements $y_1, \ldots, y_k$ are nonzero. In this case the
coefficients $a_i$ in (15.13) are given by (15.14), and the formulas defining $y_1, \ldots, y_k$ become

(15.15)    $y_1 = x_1, \qquad y_{r+1} = x_{r+1} - \sum_{i=1}^{r} \frac{(x_{r+1}, y_i)}{(y_i, y_i)}\, y_i$    for $r = 1, 2, \ldots, k - 1$.

These formulas describe the Gram-Schmidt process for constructing an orthogonal set of
nonzero elements $y_1, \ldots, y_k$ which spans the same subspace as a given independent set
$x_1, \ldots, x_k$. In particular, if $x_1, \ldots, x_k$ is a basis for a finite-dimensional Euclidean space,
then $y_1, \ldots, y_k$ is an orthogonal basis for the same space. We can also convert this to an
orthonormal basis by normalizing each element $y_i$, that is, by dividing it by its norm.
Therefore, as a corollary of Theorem 15.13 we have the following.


theorem 15.14. Every finite-dimensional Euclidean space has an orthonormal basis. 


If $x$ and $y$ are elements in a Euclidean space, with $y \ne 0$, the element

$\frac{(x, y)}{(y, y)}\, y$

is called the projection of $x$ along $y$. In the Gram-Schmidt process (15.15), we construct
the element $y_{r+1}$ by subtracting from $x_{r+1}$ the projection of $x_{r+1}$ along each of the earlier
elements $y_1, \ldots, y_r$. Figure 15.1 illustrates the construction geometrically in the vector
space $V_3$.


example 1. In $V_4$, find an orthonormal basis for the subspace spanned by the three
vectors $x_1 = (1, -1, 1, -1)$, $x_2 = (5, 1, 1, 1)$, and $x_3 = (-3, -3, 1, -3)$.


figure 15.1 The Gram-Schmidt process in $V_3$. An orthogonal set $\{y_1, y_2, y_3\}$ is
constructed from a given independent set $\{x_1, x_2, x_3\}$.


Solution. Applying the Gram-Schmidt process, we find

$y_1 = x_1 = (1, -1, 1, -1)$,

$y_2 = x_2 - \frac{(x_2, y_1)}{(y_1, y_1)}\, y_1 = x_2 - y_1 = (4, 2, 0, 2)$,

$y_3 = x_3 - \frac{(x_3, y_1)}{(y_1, y_1)}\, y_1 - \frac{(x_3, y_2)}{(y_2, y_2)}\, y_2 = x_3 - y_1 + y_2 = (0, 0, 0, 0)$.

Since $y_3 = O$, the three vectors $x_1, x_2, x_3$ must be dependent. But since $y_1$ and $y_2$ are
nonzero, the vectors $x_1$ and $x_2$ are independent. Therefore $L(x_1, x_2, x_3)$ is a subspace of
dimension 2. The set $\{y_1, y_2\}$ is an orthogonal basis for this subspace. Dividing each of
$y_1$ and $y_2$ by its norm we get an orthonormal basis consisting of the two vectors

$\frac{y_1}{\|y_1\|} = \frac{1}{2}(1, -1, 1, -1)$    and    $\frac{y_2}{\|y_2\|} = \frac{1}{\sqrt{6}}(2, 1, 0, 1)$.
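The computation in Example 1 can be reproduced mechanically. The following sketch is an illustration only: the function names `dot` and `gram_schmidt` are hypothetical, the inner product is assumed to be the usual dot product on $V_4$, and exact rational arithmetic is used so that a zero vector is detected exactly. It follows (15.13) and (15.14), including the convention $a_j = 0$ when an earlier $y_j$ is zero.

```python
from fractions import Fraction

def dot(x, y):
    """Usual dot product on V_n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(xs):
    """Return y_1, ..., y_k built from x_1, ..., x_k by (15.13) and (15.14).

    If an earlier y_j is zero, the coefficient a_j is taken to be 0,
    as in the proof of Theorem 15.13.
    """
    ys = []
    for x in xs:
        y = [Fraction(v) for v in x]
        for prev in ys:
            norm_sq = dot(prev, prev)
            if norm_sq != 0:
                a = dot(x, prev) / norm_sq          # coefficient a_j of (15.14)
                y = [yi - a * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys

xs = [(1, -1, 1, -1), (5, 1, 1, 1), (-3, -3, 1, -3)]
y1, y2, y3 = gram_schmidt(xs)
```

With these inputs `y1`, `y2`, `y3` agree with the vectors found in the solution, and `y3` is the zero vector, which signals that $x_1, x_2, x_3$ are dependent.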


example 2. The Legendre polynomials. In the linear space of all polynomials, with the
inner product $(x, y) = \int_{-1}^{1} x(t)\, y(t)\,dt$, consider the infinite sequence $x_0, x_1, x_2, \ldots$, where
$x_n(t) = t^n$. When the orthogonalization theorem is applied to this sequence it yields
another sequence of polynomials $y_0, y_1, y_2, \ldots$, first encountered by the French
mathematician A. M. Legendre (1752-1833) in his work on potential theory. The first few
polynomials are easily calculated by the Gram-Schmidt process. First of all, we have
$y_0(t) = x_0(t) = 1$. Since

$(y_0, y_0) = \int_{-1}^{1} dt = 2$    and    $(x_1, y_0) = \int_{-1}^{1} t\,dt = 0$,

we find that

$y_1(t) = x_1(t) - \frac{(x_1, y_0)}{(y_0, y_0)}\, y_0(t) = x_1(t) = t$.

Next, we use the relations

$(x_2, y_0) = \int_{-1}^{1} t^2\,dt = \frac{2}{3}, \quad (x_2, y_1) = \int_{-1}^{1} t^3\,dt = 0, \quad (y_1, y_1) = \int_{-1}^{1} t^2\,dt = \frac{2}{3}$

to obtain

$y_2(t) = x_2(t) - \frac{(x_2, y_0)}{(y_0, y_0)}\, y_0(t) - \frac{(x_2, y_1)}{(y_1, y_1)}\, y_1(t) = t^2 - \frac{1}{3}$.

Similarly, we find that

$y_3(t) = t^3 - \frac{3}{5} t, \quad y_4(t) = t^4 - \frac{6}{7} t^2 + \frac{3}{35}, \quad y_5(t) = t^5 - \frac{10}{9} t^3 + \frac{5}{21} t$.

We shall encounter these polynomials again in Volume II in our further study of differential
equations, and we shall prove that

$y_n(t) = \frac{n!}{(2n)!} \frac{d^n}{dt^n} (t^2 - 1)^n$.

The polynomials $P_n$ given by

$P_n(t) = \frac{(2n)!}{2^n (n!)^2}\, y_n(t) = \frac{1}{2^n n!} \frac{d^n}{dt^n} (t^2 - 1)^n$

are known as the Legendre polynomials. The polynomials in the corresponding orthonormal
sequence $\varphi_0, \varphi_1, \varphi_2, \ldots$, given by $\varphi_n = y_n / \|y_n\|$ are called the normalized Legendre
polynomials. From the formulas for $y_0, \ldots, y_5$ given above, we find that

$\varphi_0(t) = \frac{1}{\sqrt{2}}, \quad \varphi_1(t) = \sqrt{\frac{3}{2}}\, t, \quad \varphi_2(t) = \frac{1}{2}\sqrt{\frac{5}{2}}\,(3t^2 - 1), \quad \varphi_3(t) = \frac{1}{2}\sqrt{\frac{7}{2}}\,(5t^3 - 3t)$,

$\varphi_4(t) = \frac{1}{8}\sqrt{\frac{9}{2}}\,(35t^4 - 30t^2 + 3), \quad \varphi_5(t) = \frac{1}{8}\sqrt{\frac{11}{2}}\,(63t^5 - 70t^3 + 15t)$.
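The polynomials $y_0, \ldots, y_5$ of Example 2 can be generated by applying (15.15) to coefficient lists. The sketch below is an illustration under stated assumptions: a polynomial is represented as a list of rational coefficients indexed by the power of $t$, and the helper names `poly_mul`, `inner`, and `monic_orthogonal` are hypothetical. The inner product uses the fact that $\int_{-1}^{1} t^k\,dt$ equals $2/(k+1)$ for even $k$ and $0$ for odd $k$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Integral of p(t) q(t) over [-1, 1]; odd powers integrate to zero."""
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def monic_orthogonal(count):
    """Apply (15.15) to x_n(t) = t^n for n = 0, ..., count - 1."""
    ys = []
    for n in range(count):
        x = [Fraction(0)] * n + [Fraction(1)]        # coefficients of t^n
        y = list(x)
        for prev in ys:
            a = inner(x, prev) / inner(prev, prev)   # coefficient from (15.14)
            padded = prev + [Fraction(0)] * (len(y) - len(prev))
            y = [yi - a * pi for yi, pi in zip(y, padded)]
        ys.append(y)
    return ys

ys = monic_orthogonal(6)
```

For instance, `ys[4]` holds the coefficients of $t^4 - \frac{6}{7}t^2 + \frac{3}{35}$, listed from the constant term upward.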


15.14 Orthogonal complements. Projections 

Let $V$ be a Euclidean space and let $S$ be a finite-dimensional subspace. We wish to
consider the following type of approximation problem: Given an element $x$ in $V$, to
determine an element in $S$ whose distance from $x$ is as small as possible. The distance between
two elements $x$ and $y$ is defined to be the norm $\|x - y\|$.

Before discussing this problem in its general form, we consider a special case, illustrated
in Figure 15.2. Here $V$ is the vector space $V_3$ and $S$ is a two-dimensional subspace, a plane
through the origin. Given $x$ in $V$, the problem is to find, in the plane $S$, that point $s$
nearest to $x$.

If $x \in S$, then clearly $s = x$ is the solution. If $x$ is not in $S$, then the nearest point $s$
is obtained by dropping a perpendicular from $x$ to the plane. This simple example suggests
an approach to the general approximation problem and motivates the discussion that
follows.


definition. Let $S$ be a subset of a Euclidean space $V$. An element in $V$ is said to be
orthogonal to $S$ if it is orthogonal to every element of $S$. The set of all elements orthogonal
to $S$ is denoted by $S^\perp$ and is called "$S$ perpendicular."

It is a simple exercise to verify that $S^\perp$ is a subspace of $V$, whether or not $S$ itself is one.
In case $S$ is a subspace, then $S^\perp$ is called the orthogonal complement of $S$.

example. If $S$ is a plane through the origin, as shown in Figure 15.2, then $S^\perp$ is a line
through the origin perpendicular to this plane. This example also gives a geometric
interpretation for the next theorem.


figure 15.2 Geometric interpretation of the orthogonal decomposition theorem in $V_3$.


theorem 15.15. orthogonal decomposition theorem. Let $V$ be a Euclidean space
and let $S$ be a finite-dimensional subspace of $V$. Then every element $x$ in $V$ can be represented
uniquely as a sum of two elements, one in $S$ and one in $S^\perp$. That is, we have

(15.16)    $x = s + s^\perp$, where $s \in S$ and $s^\perp \in S^\perp$.

Moreover, the norm of $x$ is given by the Pythagorean formula

(15.17)    $\|x\|^2 = \|s\|^2 + \|s^\perp\|^2$.

Proof. First we prove that an orthogonal decomposition (15.16) actually exists. Since
$S$ is finite-dimensional, it has a finite orthonormal basis, say $\{e_1, \ldots, e_n\}$. Given $x$, define
the elements $s$ and $s^\perp$ as follows:

(15.18)    $s = \sum_{i=1}^{n} (x, e_i)\, e_i, \qquad s^\perp = x - s$.

Note that each term $(x, e_i) e_i$ is the projection of $x$ along $e_i$. The element $s$ is the sum of the
projections of $x$ along each basis element. Since $s$ is a linear combination of the basis
elements, $s$ lies in $S$. The definition of $s^\perp$ shows that Equation (15.16) holds. To prove that
$s^\perp$ lies in $S^\perp$, we consider the inner product of $s^\perp$ and any basis element $e_j$. We have

$(s^\perp, e_j) = (x - s, e_j) = (x, e_j) - (s, e_j)$.

But from (15.18), we find that $(s, e_j) = (x, e_j)$, so $s^\perp$ is orthogonal to $e_j$. Therefore $s^\perp$
is orthogonal to every element in $S$, which means that $s^\perp \in S^\perp$.

Next we prove that the orthogonal decomposition (15.16) is unique. Suppose that $x$
has two such representations, say

(15.19)    $x = s + s^\perp$ and $x = t + t^\perp$,

where $s$ and $t$ are in $S$, and $s^\perp$ and $t^\perp$ are in $S^\perp$. We wish to prove that $s = t$ and $s^\perp = t^\perp$.
From (15.19), we have $s - t = t^\perp - s^\perp$, so we need only prove that $s - t = 0$. But
$s - t \in S$ and $t^\perp - s^\perp \in S^\perp$, so $s - t$ is both orthogonal to $t^\perp - s^\perp$ and equal to $t^\perp - s^\perp$.
Since the zero element is the only element orthogonal to itself, we must have $s - t = 0$.
This shows that the decomposition is unique.

Finally, we prove that the norm of $x$ is given by the Pythagorean formula. We have

$\|x\|^2 = (x, x) = (s + s^\perp, s + s^\perp) = (s, s) + (s^\perp, s^\perp)$,

the remaining terms being zero since $s$ and $s^\perp$ are orthogonal. This proves (15.17).


definition. Let $S$ be a finite-dimensional subspace of a Euclidean space $V$, and let
$\{e_1, \ldots, e_n\}$ be an orthonormal basis for $S$. If $x \in V$, the element $s$ defined by the equation

$s = \sum_{i=1}^{n} (x, e_i)\, e_i$

is called the projection of x on the subspace S.
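A small exact computation illustrates the decomposition $x = s + s^\perp$ and the Pythagorean formula (15.17). This is a sketch under stated assumptions: the subspace $S$ is the one spanned by the orthogonal vectors $(1, -1, 1, -1)$ and $(4, 2, 0, 2)$ from Example 1 of Section 15.13, the vector `x` is a hypothetical choice, and the helper `project` works with an orthogonal (not normalized) basis by dividing each coefficient by $(y, y)$, which gives the same element as the definition with $e_i = y_i / \|y_i\|$.

```python
from fractions import Fraction

def dot(x, y):
    """Usual dot product on V_n."""
    return sum(a * b for a, b in zip(x, y))

def project(x, orthogonal_basis):
    """Projection of x on the span of nonzero, mutually orthogonal vectors."""
    s = [Fraction(0)] * len(x)
    for y in orthogonal_basis:
        coeff = Fraction(dot(x, y), dot(y, y))      # (x, y) / (y, y)
        s = [si + coeff * yi for si, yi in zip(s, y)]
    return s

basis = [(1, -1, 1, -1), (4, 2, 0, 2)]   # orthogonal basis of S
x = (1, 2, 3, 4)                          # hypothetical element of V_4

s = project(x, basis)                               # component in S
s_perp = [xi - si for xi, si in zip(x, s)]          # component in S-perpendicular
```

Here `s_perp` is orthogonal to both basis vectors, and the squared norms satisfy `dot(x, x) == dot(s, s) + dot(s_perp, s_perp)`.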


We prove next that the projection of x on S is the solution to the approximation problem 
stated at the beginning of this section. 


15.15 Best approximation of elements in a Euclidean space by elements in a finite-dimensional subspace

theorem 15.16. approximation theorem. Let $S$ be a finite-dimensional subspace of
a Euclidean space $V$, and let $x$ be any element of $V$. Then the projection of $x$ on $S$ is nearer to
$x$ than any other element of $S$. That is, if $s$ is the projection of $x$ on $S$, we have

$\|x - s\| \le \|x - t\|$

for all $t$ in $S$; the equality sign holds if and only if $t = s$.

Proof. By Theorem 15.15 we can write $x = s + s^\perp$, where $s \in S$ and $s^\perp \in S^\perp$. Then,
for any $t$ in $S$, we have

$x - t = (x - s) + (s - t)$.

Since $s - t \in S$ and $x - s = s^\perp \in S^\perp$, this is an orthogonal decomposition of $x - t$, so
its norm is given by the Pythagorean formula

$\|x - t\|^2 = \|x - s\|^2 + \|s - t\|^2$.

But $\|s - t\|^2 \ge 0$, so we have $\|x - t\|^2 \ge \|x - s\|^2$, with equality holding if and only if
$s = t$. This completes the proof.

example 1. Approximation of continuous functions on $[0, 2\pi]$ by trigonometric
polynomials. Let $V = C(0, 2\pi)$, the linear space of all real functions continuous on the interval
$[0, 2\pi]$, and define an inner product by the equation $(f, g) = \int_0^{2\pi} f(x) g(x)\,dx$. In Section 15.11
we exhibited an orthonormal set of trigonometric functions $\varphi_0, \varphi_1, \varphi_2, \ldots$, where

(15.20)    $\varphi_0(x) = \frac{1}{\sqrt{2\pi}}, \quad \varphi_{2k-1}(x) = \frac{\cos kx}{\sqrt{\pi}}, \quad \varphi_{2k}(x) = \frac{\sin kx}{\sqrt{\pi}}$,    for $k \ge 1$.

The $2n + 1$ elements $\varphi_0, \varphi_1, \ldots, \varphi_{2n}$ span a subspace $S$ of dimension $2n + 1$. The elements
of $S$ are called trigonometric polynomials.

If $f \in C(0, 2\pi)$, let $f_n$ denote the projection of $f$ on the subspace $S$. Then we have

(15.21)    $f_n = \sum_{k=0}^{2n} (f, \varphi_k)\, \varphi_k$,    where    $(f, \varphi_k) = \int_0^{2\pi} f(x)\, \varphi_k(x)\,dx$.

The numbers $(f, \varphi_k)$ are called Fourier coefficients of $f$. Using the formulas in (15.20), we
can rewrite (15.21) in the form

(15.22)    $f_n(x) = \frac{1}{2} a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$,

where

$a_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos kx\,dx, \qquad b_k = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin kx\,dx$

for $k = 0, 1, 2, \ldots, n$. The approximation theorem tells us that the trigonometric
polynomial in (15.22) approximates $f$ better than any other trigonometric polynomial in $S$,
in the sense that the norm $\|f - f_n\|$ is as small as possible.
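The coefficient formulas following (15.22) lend themselves to a numerical experiment. The sketch below is an illustration only: the test function $f(x) = x(2\pi - x)$ is a hypothetical choice, the helper names are invented for this example, and the integrals are approximated by a midpoint rule, so the computed coefficients are approximate. For this particular $f$, integration by parts gives $a_0 = 4\pi^2/3$, $a_k = -4/k^2$ for $k \ge 1$, and $b_k = 0$.

```python
import math

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def fourier_coefficients(f, n):
    """Return lists a[0..n], b[0..n] from the formulas following (15.22)."""
    a = [integrate(lambda x: f(x) * math.cos(k * x), 0, 2 * math.pi) / math.pi
         for k in range(n + 1)]
    b = [integrate(lambda x: f(x) * math.sin(k * x), 0, 2 * math.pi) / math.pi
         for k in range(n + 1)]
    return a, b

def trig_polynomial(a, b):
    """The trigonometric polynomial f_n of (15.22) built from the coefficients."""
    def f_n(x):
        return a[0] / 2 + sum(a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
                              for k in range(1, len(a)))
    return f_n

f = lambda x: x * (2 * math.pi - x)      # hypothetical test function
a, b = fourier_coefficients(f, 3)
f3 = trig_polynomial(a, b)               # projection of f on S for n = 3
```

The function `f3` is then the element of $S$ (with $n = 3$) nearest to `f` in the norm of $C(0, 2\pi)$, up to the quadrature error.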


example 2. Approximation of continuous functions on $[-1, 1]$ by polynomials of
degree $\le n$. Let $V = C(-1, 1)$, the space of real continuous functions on $[-1, 1]$, and let
$(f, g) = \int_{-1}^{1} f(x) g(x)\,dx$. The $n + 1$ normalized Legendre polynomials $\varphi_0, \varphi_1, \ldots, \varphi_n$,
introduced in Section 15.13, span a subspace $S$ of dimension $n + 1$ consisting of all
polynomials of degree $\le n$. If $f \in C(-1, 1)$, let $f_n$ denote the projection of $f$ on $S$. Then we
have

$f_n = \sum_{k=0}^{n} (f, \varphi_k)\, \varphi_k$,    where    $(f, \varphi_k) = \int_{-1}^{1} f(t)\, \varphi_k(t)\,dt$.

This is the polynomial of degree $\le n$ for which the norm $\|f - f_n\|$ is smallest. For example,
when $f(x) = \sin \pi x$, the coefficients $(f, \varphi_k)$ are given by

$(f, \varphi_k) = \int_{-1}^{1} \sin \pi t\; \varphi_k(t)\,dt$.

In particular, we have $(f, \varphi_0) = 0$ and

$(f, \varphi_1) = \int_{-1}^{1} \sqrt{\frac{3}{2}}\; t \sin \pi t\,dt = \sqrt{\frac{3}{2}}\, \frac{2}{\pi}$.

Therefore the linear polynomial $f_1(t)$ which is nearest to $\sin \pi t$ on $[-1, 1]$ is

$f_1(t) = \sqrt{\frac{3}{2}}\, \frac{2}{\pi}\, \varphi_1(t) = \frac{3}{\pi}\, t$.

Since $(f, \varphi_2) = 0$, this is also the nearest quadratic approximation.
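The conclusion of Example 2 can be checked numerically. The sketch below is an illustration under stated assumptions: the integrals over $[-1, 1]$ are approximated by a midpoint rule, the helper names are hypothetical, and the comparison line $0.9\,t$ is an arbitrary competitor chosen only to show that it lies farther from $\sin \pi t$ than the projection does.

```python
import math

def inner(f, g, steps=20000):
    """Midpoint-rule approximation of the integral of f(t) g(t) over [-1, 1]."""
    h = 2.0 / steps
    return h * sum(f(-1 + (i + 0.5) * h) * g(-1 + (i + 0.5) * h) for i in range(steps))

# First three normalized Legendre polynomials from Section 15.13.
phi0 = lambda t: 1 / math.sqrt(2)
phi1 = lambda t: math.sqrt(1.5) * t
phi2 = lambda t: 0.5 * math.sqrt(2.5) * (3 * t * t - 1)

f = lambda t: math.sin(math.pi * t)
c0, c1, c2 = (inner(f, p) for p in (phi0, phi1, phi2))
slope = c1 * math.sqrt(1.5)      # coefficient of t in f_1(t) = c1 * phi1(t)

def dist_sq(g):
    """Squared distance ||f - g||^2 in C(-1, 1)."""
    return inner(lambda t: f(t) - g(t), lambda t: f(t) - g(t))

best = dist_sq(lambda t: slope * t)       # distance to the projection
other = dist_sq(lambda t: 0.9 * t)        # distance to an arbitrary competitor
```

Up to quadrature error, `slope` equals $3/\pi$, the coefficients `c0` and `c2` vanish, and `best` is smaller than `other`, in agreement with the approximation theorem.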


15.16 Exercises 

1. In each case, find an orthonormal basis for the subspace of $V_3$ spanned by the given vectors.

(a) $x_1 = (1, 1, 1)$, $x_2 = (1, 0, 1)$, $x_3 = (3, 2, 3)$.

(b) $x_1 = (1, 1, 1)$, $x_2 = (-1, 1, -1)$, $x_3 = (1, 0, 1)$.

2. In each case, find an orthonormal basis for the subspace of $V_4$ spanned by the given vectors.

(a) $x_1 = (1, 1, 0, 0)$, $x_2 = (0, 1, 1, 0)$, $x_3 = (0, 0, 1, 1)$, $x_4 = (1, 0, 0, 1)$.

(b) $x_1 = (1, 1, 0, 1)$, $x_2 = (1, 0, 2, 1)$, $x_3 = (1, 2, -2, 1)$.

3. In the real linear space $C(0, \pi)$, with inner product $(x, y) = \int_0^{\pi} x(t) y(t)\,dt$, let $x_n(t) = \cos nt$
for $n = 0, 1, 2, \ldots$. Prove that the functions $y_0, y_1, y_2, \ldots$, given by

$y_0(t) = \frac{1}{\sqrt{\pi}}$    and    $y_n(t) = \sqrt{\frac{2}{\pi}} \cos nt$    for $n \ge 1$,

form an orthonormal set spanning the same subspace as $x_0, x_1, x_2, \ldots$.

4. In the linear space of all real polynomials, with inner product $(x, y) = \int_0^1 x(t) y(t)\,dt$, let
$x_n(t) = t^n$ for $n = 0, 1, 2, \ldots$. Prove that the functions

$y_0(t) = 1, \quad y_1(t) = \sqrt{3}\,(2t - 1), \quad y_2(t) = \sqrt{5}\,(6t^2 - 6t + 1)$

form an orthonormal set spanning the same subspace as $\{x_0, x_1, x_2\}$.

5. Let $V$ be the linear space of all real functions $f$ continuous on $[0, +\infty)$ and such that the
integral $\int_0^{\infty} e^{-t} f^2(t)\,dt$ converges. Define $(f, g) = \int_0^{\infty} e^{-t} f(t) g(t)\,dt$, and let $y_0, y_1, y_2, \ldots$ be
the set obtained by applying the Gram-Schmidt process to $x_0, x_1, x_2, \ldots$, where $x_n(t) = t^n$
for $n \ge 0$. Prove that $y_0(t) = 1$, $y_1(t) = t - 1$, $y_2(t) = t^2 - 4t + 2$, $y_3(t) = t^3 - 9t^2 + 18t - 6$.

6. In the real linear space $C(1, 3)$ with inner product $(f, g) = \int_1^3 f(x) g(x)\,dx$, let $f(x) = 1/x$
and show that the constant polynomial $g$ nearest to $f$ is $g = \frac{1}{2} \log 3$. Compute $\|g - f\|^2$ for
this $g$.

7. In the real linear space $C(0, 2)$ with inner product $(f, g) = \int_0^2 f(x) g(x)\,dx$, let $f(x) = e^x$ and
show that the constant polynomial $g$ nearest to $f$ is $g = \frac{1}{2}(e^2 - 1)$. Compute $\|g - f\|^2$ for
this $g$.

8. In the real linear space $C(-1, 1)$ with inner product $(f, g) = \int_{-1}^{1} f(x) g(x)\,dx$, let $f(x) = e^x$
and find the linear polynomial $g$ nearest to $f$. Compute $\|g - f\|^2$ for this $g$.

9. In the real linear space $C(0, 2\pi)$ with inner product $(f, g) = \int_0^{2\pi} f(x) g(x)\,dx$, let $f(x) = x$.
In the subspace spanned by $u_0(x) = 1$, $u_1(x) = \cos x$, $u_2(x) = \sin x$, find the trigonometric
polynomial nearest to $f$.

10. In the linear space $V$ of Exercise 5, let $f(x) = e^{-x}$ and find the linear polynomial that is nearest
to $f$.


LINEAR TRANSFORMATIONS AND MATRICES


16.1 Linear transformations 

One of the ultimate goals of analysis is a comprehensive study of functions whose
domains and ranges are subsets of linear spaces. Such functions are called transformations,
mappings, or operators. This chapter treats the simplest examples, called linear
transformations, which occur in all branches of mathematics. Properties of more general
transformations are often obtained by approximating them by linear transformations.

First we introduce some notation and terminology concerning arbitrary functions. Let
$V$ and $W$ be two sets. The symbol

$T : V \to W$

will be used to indicate that $T$ is a function whose domain is $V$ and whose values are in $W$.
For each $x$ in $V$, the element $T(x)$ in $W$ is called the image of $x$ under $T$, and we say that $T$
maps $x$ onto $T(x)$. If $A$ is any subset of $V$, the set of all images $T(x)$ for $x$ in $A$ is called the
image of $A$ under $T$ and is denoted by $T(A)$. The image of the domain $V$, $T(V)$, is the range
of $T$.

Now we assume that V and W are linear spaces having the same set of scalars, and we 
define a linear transformation as follows. 

definition. If $V$ and $W$ are linear spaces, a function $T : V \to W$ is called a linear
transformation of $V$ into $W$ if it has the following two properties:

(a) $T(x + y) = T(x) + T(y)$ for all $x$ and $y$ in $V$,

(b) $T(cx) = cT(x)$ for all $x$ in $V$ and all scalars $c$.

These properties are verbalized by saying that $T$ preserves addition and multiplication by
scalars. The two properties can be combined into one formula which states that

$T(ax + by) = aT(x) + bT(y)$

for all $x$, $y$ in $V$ and all scalars $a$ and $b$. By induction, we also have the more general
relation

$T\left( \sum_{i=1}^{n} a_i x_i \right) = \sum_{i=1}^{n} a_i T(x_i)$

for any $n$ elements $x_1, \ldots, x_n$ in $V$ and any $n$ scalars $a_1, \ldots, a_n$.

The reader can easily verify that the following examples are linear transformations.

example 1. The identity transformation. The transformation $T : V \to V$, where $T(x) = x$
for each $x$ in $V$, is called the identity transformation and is denoted by $I$ or by $I_V$.

example 2. The zero transformation. The transformation $T : V \to V$ which maps each
element of $V$ onto 0 is called the zero transformation and is denoted by 0.

example 3. Multiplication by a fixed scalar c. Here we have $T : V \to V$, where $T(x) = cx$
for all $x$ in $V$. When $c = 1$, this is the identity transformation. When $c = 0$, it is the zero
transformation.

example 4. Linear equations. Let $V = V_n$ and $W = V_m$. Given $mn$ real numbers $a_{ik}$,
where $i = 1, 2, \ldots, m$ and $k = 1, 2, \ldots, n$, define $T : V_n \to V_m$ as follows: $T$ maps each
vector $x = (x_1, \ldots, x_n)$ in $V_n$ onto the vector $y = (y_1, \ldots, y_m)$ in $V_m$ according to the
equations

$y_i = \sum_{k=1}^{n} a_{ik} x_k$    for $i = 1, 2, \ldots, m$.

example 5. Inner product with a fixed element. Let $V$ be a real Euclidean space. For a
fixed element $z$ in $V$, define $T : V \to R$ as follows: If $x \in V$, then $T(x) = (x, z)$, the inner
product of $x$ with $z$.

example 6. Projection on a subspace. Let $V$ be a Euclidean space and let $S$ be a
finite-dimensional subspace of $V$. Define $T : V \to S$ as follows: If $x \in V$, then $T(x)$ is the
projection of $x$ on $S$.

example 7. The differentiation operator. Let $V$ be the linear space of all real functions
$f$ differentiable on an open interval $(a, b)$. The linear transformation which maps each
function $f$ in $V$ onto its derivative $f'$ is called the differentiation operator and is denoted by
$D$. Thus, we have $D : V \to W$, where $D(f) = f'$ for each $f$ in $V$. The space $W$ consists of
all derivatives $f'$.

example 8. The integration operator. Let $V$ be the linear space of all real functions
continuous on an interval $[a, b]$. If $f \in V$, define $g = T(f)$ to be that function in $V$ given by

$g(x) = \int_a^x f(t)\,dt$    if $a \le x \le b$.

This transformation T is called the integration operator.
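The two defining properties of a linear transformation can be spot-checked for the map of Example 4. The following sketch is an illustration only: the coefficient array `a`, the test vectors, and the helper names are hypothetical, and checking a few vectors does not prove linearity, it merely exercises properties (a) and (b) of the definition on concrete data.

```python
from fractions import Fraction

def make_transformation(a):
    """Return T: V_n -> V_m with y_i = sum_k a[i][k] * x[k], as in Example 4."""
    def T(x):
        return tuple(sum(row[k] * x[k] for k in range(len(x))) for row in a)
    return T

def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def scale(c, x):
    """Scalar multiple c * x."""
    return tuple(c * xi for xi in x)

# Hypothetical 2 x 3 coefficient array, so T maps V_3 into V_2.
a = [[1, 2, 0],
     [0, -1, 3]]
T = make_transformation(a)

x, y = (1, 0, 2), (-3, 4, 1)
c = Fraction(5, 2)

additive = T(add(x, y)) == add(T(x), T(y))          # property (a)
homogeneous = T(scale(c, x)) == scale(c, T(x))      # property (b)
```

By contrast, a map such as $(x, y) \mapsto (x + 1, y + 1)$ sends the zero vector to $(1, 1)$ rather than to zero, so it cannot be linear.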

16.2 Null space and range 

In this section, T denotes a linear transformation of a linear space V into a linear space W. 

theorem 16.1. The set $T(V)$ (the range of $T$) is a subspace of $W$. Moreover, $T$ maps
the zero element of $V$ onto the zero element of $W$.

Proof. To prove that $T(V)$ is a subspace of $W$, we need only verify the closure axioms.
Take any two elements of $T(V)$, say $T(x)$ and $T(y)$. Then $T(x) + T(y) = T(x + y)$, so
$T(x) + T(y)$ is in $T(V)$. Also, for any scalar $c$ we have $cT(x) = T(cx)$, so $cT(x)$ is in $T(V)$.
Therefore, $T(V)$ is a subspace of $W$. Taking $c = 0$ in the relation $T(cx) = cT(x)$, we find
that $T(0) = 0$.


definition. The set of all elements in $V$ that $T$ maps onto 0 is called the null space of
$T$ and is denoted by $N(T)$. Thus, we have

$N(T) = \{x \mid x \in V \text{ and } T(x) = 0\}$.

The null space is sometimes called the kernel of $T$.

theorem 16.2. The null space of $T$ is a subspace of $V$.

Proof. If $x$ and $y$ are in $N(T)$, then so are $x + y$ and $cx$ for all scalars $c$, since
$T(x + y) = T(x) + T(y) = 0$ and $T(cx) = cT(x) = 0$.

The following examples describe the null spaces of the linear transformations given in 
Section 16.1. 

example 1. Identity transformation. The null space is $\{0\}$, the subspace consisting of
the zero element alone.

example 2. Zero transformation. Since every element of $V$ is mapped onto zero, the
null space is $V$ itself.

example 3. Multiplication by a fixed scalar c. If $c \ne 0$, the null space contains only 0.
If $c = 0$, the null space is $V$.

example 4. Linear equations. The null space consists of all vectors $(x_1, \ldots, x_n)$ in $V_n$
for which

$\sum_{k=1}^{n} a_{ik} x_k = 0$    for $i = 1, 2, \ldots, m$.

example 5. Inner product with a fixed element z. The null space consists of all elements
in $V$ orthogonal to $z$.

example 6. Projection on a subspace S. If $x \in V$, we have the unique orthogonal
decomposition $x = s + s^\perp$ (by Theorem 15.15). Since $T(x) = s$, we have $T(x) = 0$
if and only if $x = s^\perp$. Therefore, the null space is $S^\perp$, the orthogonal complement of $S$.

example 7. Differentiation operator. The null space consists of all functions that are
constant on the given interval.

example 8. Integration operator. The null space contains only the zero function.

16.3 Nullity and rank

Again in this section $T$ denotes a linear transformation of a linear space $V$ into a linear
space $W$. We are interested in the relation between the dimensionality of $V$, of the null
space $N(T)$, and of the range $T(V)$. If $V$ is finite-dimensional, then the null space is also
finite-dimensional since it is a subspace of $V$. The dimension of $N(T)$ is called the nullity
of $T$. In the next theorem, we prove that the range is also finite-dimensional; its dimension
is called the rank of $T$.

theorem 16.3. nullity plus rank theorem. If $V$ is finite-dimensional, then $T(V)$
is also finite-dimensional, and we have

(16.1)    $\dim N(T) + \dim T(V) = \dim V$.

In other words, the nullity plus the rank of a linear transformation is equal to the dimension
of its domain.

Proof. Let $n = \dim V$ and let $e_1, \ldots, e_k$ be a basis for $N(T)$, where $k = \dim N(T) \le n$.
By Theorem 15.7, these elements are part of some basis for $V$, say the basis

(16.2)    $e_1, \ldots, e_k, e_{k+1}, \ldots, e_{k+r}$,

where $k + r = n$. We shall prove that the $r$ elements

(16.3)    $T(e_{k+1}), \ldots, T(e_{k+r})$

form a basis for $T(V)$, thus proving that $\dim T(V) = r$. Since $k + r = n$, this also proves
(16.1).

First we show that the $r$ elements in (16.3) span $T(V)$. If $y \in T(V)$, we have $y = T(x)$
for some $x$ in $V$, and we can write $x = c_1 e_1 + \cdots + c_{k+r} e_{k+r}$. Hence, we have

$y = T(x) = \sum_{i=1}^{k+r} c_i T(e_i) = \sum_{i=1}^{k} c_i T(e_i) + \sum_{i=k+1}^{k+r} c_i T(e_i) = \sum_{i=k+1}^{k+r} c_i T(e_i)$

since $T(e_1) = \cdots = T(e_k) = 0$. This shows that the elements in (16.3) span $T(V)$.

Now we show that these elements are independent. Suppose that there are scalars
$c_{k+1}, \ldots, c_{k+r}$ such that

$\sum_{i=k+1}^{k+r} c_i T(e_i) = 0$.

This implies that

$T\left( \sum_{i=k+1}^{k+r} c_i e_i \right) = 0$,

so the element $x = c_{k+1} e_{k+1} + \cdots + c_{k+r} e_{k+r}$ is in the null space $N(T)$. This means there
are scalars $c_1, \ldots, c_k$ such that $x = c_1 e_1 + \cdots + c_k e_k$, so we have

$x - x = \sum_{i=1}^{k} c_i e_i - \sum_{i=k+1}^{k+r} c_i e_i = 0$.

But since the elements in (16.2) are independent, this implies that all the scalars $c_i$ are zero.
Therefore, the elements in (16.3) are independent.

Note: If V is infinite-dimensional, then at least one of N(T) or T(V) is infinite- 
dimensional. A proof of this fact is outlined in Exercise 30 of Section 16.4. 
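
As an illustration only (it is not part of the proof), the following Python sketch checks Equation (16.1) for one concrete transformation, T(x, y, z) = (x + z, 0, x + y), taken from Exercise 23 of Section 16.4. The helper names coefficient_rows, reduce_rows, and null_space_basis are hypothetical, and the use of row reduction on the coefficients of T is an assumption of this sketch, not the method of the text.

```python
from fractions import Fraction

def T(v):
    """Exercise 23 of Section 16.4: T(x, y, z) = (x + z, 0, x + y)."""
    x, y, z = v
    return (x + z, 0, x + y)

def coefficient_rows(transform, n):
    """One row of coefficients per output coordinate, read off from T(e_1), ..., T(e_n)."""
    images = [transform(tuple(1 if i == k else 0 for i in range(n))) for k in range(n)]
    return [[Fraction(img[i]) for img in images] for i in range(len(images[0]))]

def reduce_rows(rows):
    """Gauss-Jordan elimination; returns the reduced rows and the pivot columns."""
    rows = [row[:] for row in rows]
    pivots = []
    for col in range(len(rows[0])):
        r = len(pivots)
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [a / lead for a in rows[r]]
        for i in range(len(rows)):
            if i != r:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
    return rows, pivots

def null_space_basis(rows):
    """Independent elements spanning N(T), one for each non-pivot column."""
    reduced, pivots = reduce_rows(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -reduced[r][free]
        basis.append(tuple(v))
    return basis

rows = coefficient_rows(T, 3)
_, pivots = reduce_rows(rows)
rank = len(pivots)                # dim T(V)
kernel = null_space_basis(rows)   # a basis for N(T)
nullity = len(kernel)             # dim N(T)

# Every element of the computed basis really lies in the null space.
assert all(T(v) == (0, 0, 0) for v in kernel)
assert nullity + rank == 3        # dim N(T) + dim T(V) = dim V
```

Here the null space is spanned by the single element (-1, 1, 1), so the nullity is 1 and the rank is 2.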


16.4 Exercises 


In each of Exercises 1 through 10, a transformation $T: V_2 \to V_2$ is defined by the formula given
for T(x, y), where (x, y) is an arbitrary point in $V_2$. In each case determine whether T is linear. If
T is linear, describe its null space and range, and compute its nullity and rank.

1. T(x, y) = (y, x).
2. T(x, y) = (x, -y).
3. T(x, y) = (x, 0).
4. T(x, y) = (x, x).
5. T(x, y) = $(x^2, y^2)$.
6. T(x, y) = $(e^x, e^y)$.
7. T(x, y) = (x, 1).
8. T(x, y) = (x + 1, y + 1).
9. T(x, y) = (x - y, x + y).
10. T(x, y) = (2x - y, x + y).


Do the same as above for each of Exercises 11 through 15 if the transformation $T: V_2 \to V_2$
is described as indicated.

11. T rotates every point through the same angle $\phi$ about the origin. That is, T maps a point
with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(r, \theta + \phi)$, where $\phi$ is
fixed. Also, T maps O onto itself.

12. T maps each point onto its reflection with respect to a fixed line through the origin.

13. T maps every point onto the point (1, 1).

14. T maps each point with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(2r, \theta)$.
Also, T maps O onto itself.

15. T maps each point with polar coordinates $(r, \theta)$ onto the point with polar coordinates $(r, 2\theta)$.
Also, T maps O onto itself.


Do the same as above in each of Exercises 16 through 23 if a transformation $T: V_3 \to V_3$ is
defined by the formula given for T(x, y, z), where (x, y, z) is an arbitrary point of $V_3$.

16. T(x, y, z) = (z, y, x).
17. T(x, y, z) = (x, y, 0).
18. T(x, y, z) = (x, 2y, 3z).
19. T(x, y, z) = (x, y, 1).
20. T(x, y, z) = (x + 1, y + 1, z - 1).
21. T(x, y, z) = (x + 1, y + 2, z + 3).
22. T(x, y, z) = $(x, y^2, z^3)$.
23. T(x, y, z) = (x + z, 0, x + y).


In each of Exercises 24 through 27, a transformation $T: V \to V$ is described as indicated. In
each case, determine whether T is linear. If T is linear, describe its null space and range, and
compute the nullity and rank when they are finite.

24. Let V be the linear space of all real polynomials p(x) of degree $\le n$. If $p \in V$, q = T(p) means
that q(x) = p(x + 1) for all real x.

25. Let V be the linear space of all real functions differentiable on the open interval (-1, 1).
If $f \in V$, g = T(f) means that g(x) = x f'(x) for all x in (-1, 1).

26. Let V be the linear space of all real functions continuous on [a, b]. If $f \in V$, g = T(f) means
that

$$g(x) = \int f(t) \sin(x - t)\, dt \quad \text{for } a \le x \le b.$$

27. Let V be the space of all real functions twice differentiable on an open interval (a, b). If
$y \in V$, define T(y) = y'' + Py' + Qy, where P and Q are fixed constants.

28. Let V be the linear space of all real convergent sequences $\{x_n\}$. Define a transformation
$T: V \to V$ as follows: If $x = \{x_n\}$ is a convergent sequence with limit a, let $T(x) = \{y_n\}$,
where $y_n = a - x_n$ for $n \ge 1$. Prove that T is linear and describe the null space and range of T.

29. Let V denote the linear space of all real functions continuous on the interval $[-\pi, \pi]$.
Let S be that subset of V consisting of all f satisfying the three equations

$$\int_{-\pi}^{\pi} f(t)\, dt = 0, \qquad \int_{-\pi}^{\pi} f(t) \cos t\, dt = 0, \qquad \int_{-\pi}^{\pi} f(t) \sin t\, dt = 0.$$

(a) Prove that S is a subspace of V.

(b) Prove that S contains the functions f(x) = cos nx and f(x) = sin nx for each n = 2, 3, ... .

(c) Prove that S is infinite-dimensional.

Let $T: V \to V$ be the linear transformation defined as follows: If $f \in V$, g = T(f) means that

$$g(x) = \int_{-\pi}^{\pi} \{1 + \cos(x - t)\} f(t)\, dt.$$

(d) Prove that T(V), the range of T, is finite-dimensional and find a basis for T(V).

(e) Determine the null space of T.

(f) Find all real $c \ne 0$ and all nonzero f in V such that T(f) = cf. (Note that such an f
lies in the range of T.)

30. Let $T: V \to W$ be a linear transformation of a linear space V into a linear space W. If V is
infinite-dimensional, prove that at least one of T(V) or N(T) is infinite-dimensional.

[Hint: Assume dim N(T) = k, dim T(V) = r, let $e_1, \dots, e_k$ be a basis for N(T)
and let $e_1, \dots, e_k, e_{k+1}, \dots, e_{k+n}$ be independent elements in V, where n > r. The
elements $T(e_{k+1}), \dots, T(e_{k+n})$ are dependent since n > r. Use this fact to obtain a
contradiction.]


16.5 Algebraic operations on linear transformations 

Functions whose values lie in a given linear space W can be added to each other and can
be multiplied by the scalars in W according to the following definition.


DEFINITION. Let $S: V \to W$ and $T: V \to W$ be two functions with a common domain V
and with values in a linear space W. If c is any scalar in W, we define the sum S + T and the
product cT by the equations

(16.4)    (S + T)(x) = S(x) + T(x),    (cT)(x) = cT(x)

for all x in V.

We are especially interested in the case where V is also a linear space having the same
scalars as W. In this case we denote by $\mathcal{L}(V, W)$ the set of all linear transformations of V
into W.

If S and T are two linear transformations in $\mathcal{L}(V, W)$, it is an easy exercise to verify that
S + T and cT are also linear transformations in $\mathcal{L}(V, W)$. More than this is true. With the
operations just defined, the set $\mathcal{L}(V, W)$ itself becomes a new linear space. The zero
transformation serves as the zero element of this space, and the transformation (-1)T
is the negative of T. It is a straightforward matter to verify that all ten axioms for a linear
space are satisfied. Therefore, we have the following.

THEOREM 16.4. The set $\mathcal{L}(V, W)$ of all linear transformations of V into W is a linear
space with the operations of addition and multiplication by scalars defined as in (16.4).

A more interesting algebraic operation on linear transformations is composition or 
multiplication of transformations. This operation makes no use of the algebraic structure 
of a linear space and can be defined quite generally as follows. 



Figure 16.1 Illustrating the composition of two transformations. 


DEFINITION. Let U, V, W be sets. Let $T: U \to V$ be a function with domain U and
values in V, and let $S: V \to W$ be another function with domain V and values in W. Then
the composition ST is the function $ST: U \to W$ defined by the equation

(ST)(x) = S[T(x)] for every x in U.

Thus, to map x by the composition ST, we first map x by T and then map T(x) by S. 
This is illustrated in Figure 16.1. 

Composition of real-valued functions has been encountered repeatedly in our study of 
calculus, and we have seen that the operation is, in general, not commutative. However, 
as in the case of real-valued functions, composition does satisfy an associative law. 

THEOREM 16.5. If $T: U \to V$, $S: V \to W$, and $R: W \to X$ are three functions, then we have

R(ST) = (RS)T.

Proof. Both functions R(ST) and (RS)T have domain U and values in X. For each x
in U, we have

[R(ST)](x) = R[(ST)(x)] = R[S[T(x)]] and [(RS)T](x) = (RS)[T(x)] = R[S[T(x)]],

which proves that R(ST) = (RS)T.

DEFINITION. Let $T: V \to V$ be a function which maps V into itself. We define integral
powers of T inductively as follows:

$T^0 = I$,    $T^n = T T^{n-1}$ for $n \ge 1$.

Here I is the identity transformation. The reader may verify that the associative law
implies the law of exponents $T^m T^n = T^{m+n}$ for all nonnegative integers m and n.
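
The definitions of composition and of integral powers translate directly into code. The Python sketch below models functions as callables; compose, identity, and power are hypothetical helper names, and the three sample functions on the integers are chosen purely for illustration. It spot-checks the associative law, the failure of commutativity, and the law of exponents at a few sample points, which is evidence rather than proof.

```python
def compose(S, T):
    """Return the composition ST, that is, the function x -> S(T(x))."""
    return lambda x: S(T(x))

def identity(x):
    return x

def power(T, n):
    """Integral powers: T^0 = I and T^n = T T^(n-1) for n >= 1."""
    result = identity
    for _ in range(n):
        result = compose(T, result)
    return result

# Three sample functions on the integers (illustrative assumptions only).
T = lambda x: x + 1
S = lambda x: 2 * x
R = lambda x: x * x

samples = range(-5, 6)

# Associative law: R(ST) = (RS)T.
assert all(compose(R, compose(S, T))(x) == compose(compose(R, S), T)(x) for x in samples)

# Composition is not commutative in general: here ST and TS differ.
assert any(compose(S, T)(x) != compose(T, S)(x) for x in samples)

# Law of exponents: T^m T^n = T^(m+n), checked for m = 2, n = 3.
assert all(compose(power(T, 2), power(T, 3))(x) == power(T, 5)(x) for x in samples)
```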

The next theorem shows that the composition of linear transformations is again linear. 

THEOREM 16.6. If U, V, W are linear spaces with the same scalars, and if $T: U \to V$
and $S: V \to W$ are linear transformations, then the composition $ST: U \to W$ is linear.

Proof. For all x, y in U and all scalars a and b, we have

(ST)(ax + by) = S[T(ax + by)] = S[aT(x) + bT(y)] = aST(x) + bST(y).

Composition can be combined with the algebraic operations of addition and multiplication
of scalars in $\mathcal{L}(V, W)$ to give us the following.


THEOREM 16.7. Let U, V, W be linear spaces with the same scalars, assume S and T
are in $\mathcal{L}(V, W)$, and let c be any scalar.

(a) For any function R with values in V, we have

(S + T)R = SR + TR and (cS)R = c(SR).

(b) For any linear transformation $R: W \to U$, we have

R(S + T) = RS + RT and R(cS) = c(RS).

The proof is a straightforward application of the definition of composition and is left as 
an exercise. 

16.6 Inverses 

In our study of real-valued functions we learned how to construct new functions by 
inversion of monotonic functions. Now we wish to extend the process of inversion to a 
more general class of functions. 

Given a function T, our goal is to find, if possible, another function S whose composition 
with T is the identity transformation. Since composition is in general not commutative, 
we have to distinguish between ST and TS. Therefore we introduce two kinds of inverses 
which we call left and right inverses. 


DEFINITION. Given two sets V and W and a function $T: V \to W$. A function $S: T(V) \to V$
is called a left inverse of T if S[T(x)] = x for all x in V, that is, if

$ST = I_V$,

where $I_V$ is the identity transformation on V. A function $R: T(V) \to V$ is called a right inverse
of T if T[R(y)] = y for all y in T(V), that is, if

$TR = I_{T(V)}$,

where $I_{T(V)}$ is the identity transformation on T(V).

EXAMPLE. A function with no left inverse but with two right inverses. Let V = {1, 2}
and let W = {0}. Define $T: V \to W$ as follows: T(1) = T(2) = 0. This function has
two right inverses $R: W \to V$ and $R': W \to V$ given by

R(0) = 1,    R'(0) = 2.

It cannot have a left inverse S since this would require

1 = S[T(1)] = S(0) and 2 = S[T(2)] = S(0).

This simple example shows that left inverses need not exist and that right inverses need not
be unique.
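
Because V and W are finite here, the claim can also be confirmed by brute force. The Python sketch below stores each function as a dictionary and enumerates every function from T(V) to V; the helper name all_functions is a hypothetical choice made for this illustration.

```python
from itertools import product

V = [1, 2]
T = {1: 0, 2: 0}   # T(1) = T(2) = 0

def all_functions(domain, codomain):
    """Every function domain -> codomain, each stored as a dict."""
    return [dict(zip(domain, values)) for values in product(codomain, repeat=len(domain))]

range_T = sorted(set(T.values()))        # T(V) = {0}
candidates = all_functions(range_T, V)   # all functions T(V) -> V

# R is a right inverse if T[R(y)] = y for all y in T(V).
right_inverses = [R for R in candidates if all(T[R[y]] == y for y in range_T)]

# S is a left inverse if S[T(x)] = x for all x in V.
left_inverses = [S for S in candidates if all(S[T[x]] == x for x in V)]

print(right_inverses)   # the two right inverses R and R'
print(left_inverses)    # an empty list: no left inverse exists
```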

Every function $T: V \to W$ has at least one right inverse. In fact, each y in T(V) has the
form y = T(x) for at least one x in V. If we select one such x and define R(y) = x, then
T[R(y)] = T(x) = y for each y in T(V), so R is a right inverse. Nonuniqueness may occur
because there may be more than one x in V which maps onto a given y in T(V). We shall
prove presently (in Theorem 16.9) that if each y in T(V) is the image of exactly one x in V,
then right inverses are unique.

First we prove that if a left inverse exists it is unique and, at the same time, is a right 
inverse. 

THEOREM 16.8. A function $T: V \to W$ can have at most one left inverse. If T has a left
inverse S, then S is also a right inverse.

Proof. Assume T has two left inverses, $S: T(V) \to V$ and $S': T(V) \to V$. Choose any
y in T(V). We shall prove that S(y) = S'(y). Now y = T(x) for some x in V, so we have

S[T(x)] = x and S'[T(x)] = x,

since both S and S' are left inverses. Therefore S(y) = x and S'(y) = x, so S(y) = S'(y)
for all y in T(V). Therefore S = S', which proves that left inverses are unique.

Now we prove that every left inverse S is also a right inverse. Choose any element y in
T(V). We shall prove that T[S(y)] = y. Since $y \in T(V)$, we have y = T(x) for some x in
V. But S is a left inverse, so

x = S[T(x)] = S(y).

Applying T, we get T(x) = T[S(y)]. But y = T(x), so y = T[S(y)], which completes the
proof.

The next theorem characterizes all functions having left inverses.

THEOREM 16.9. A function $T: V \to W$ has a left inverse if and only if T maps distinct
elements of V onto distinct elements of W; that is, if and only if, for all x and y in V,

(16.5)    $x \ne y$ implies $T(x) \ne T(y)$.

Note: Condition (16.5) is equivalent to the statement

(16.6)    T(x) = T(y) implies x = y.

A function T satisfying (16.5) or (16.6) for all x and y in V is said to be one-to-one on V.

Proof. Assume T has a left inverse S, and assume that T(x) = T(y). We wish to prove
that x = y. Applying S, we find S[T(x)] = S[T(y)]. Since S[T(x)] = x and S[T(y)] = y,
this implies x = y. This proves that a function with a left inverse is one-to-one on its
domain.

Now we prove the converse. Assume T is one-to-one on V. We shall exhibit a function
$S: T(V) \to V$ which is a left inverse of T. If $y \in T(V)$, then y = T(x) for some x in V. By
(16.6), there is exactly one x in V for which y = T(x). Define S(y) to be this x. That is,
we define S on T(V) as follows:

S(y) = x means that T(x) = y.

Then we have S[T(x)] = x for each x in V, so $ST = I_V$. Therefore, the function S so
defined is a left inverse of T.
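
For a function on a finite set, the construction in this proof amounts to reversing a lookup table. The following Python sketch makes that concrete; left_inverse is a hypothetical helper name, and the sample function T is an assumption chosen only for illustration.

```python
def left_inverse(T):
    """Return S: T(V) -> V with S[T[x]] = x, or None if T is not one-to-one."""
    S = {}
    for x, y in T.items():
        if y in S:       # two distinct elements of V map onto the same y
            return None
        S[y] = x         # S(y) = x means that T(x) = y
    return S

T = {0: 'a', 1: 'b', 2: 'c'}   # a one-to-one function on V = {0, 1, 2}
S = left_inverse(T)

assert all(S[T[x]] == x for x in T)   # S is a left inverse of T
assert all(T[S[y]] == y for y in S)   # S is also a right inverse (Theorem 16.8)

# The function T(1) = T(2) = 0 is not one-to-one, so no left inverse exists.
assert left_inverse({1: 0, 2: 0}) is None
```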


DEFINITION. Let $T: V \to W$ be one-to-one on V. The unique left inverse of T (which
we know is also a right inverse) is denoted by $T^{-1}$. We say that T is invertible, and we call
$T^{-1}$ the inverse of T.


The results of this section refer to arbitrary functions. Now we apply these ideas to 
linear transformations. 


16.7 One-to-one linear transformations 

In this section, V and W denote linear spaces with the same scalars, and $T: V \to W$
denotes a linear transformation in $\mathcal{L}(V, W)$. The linearity of T enables us to express the
one-to-one property in several equivalent forms.

THEOREM 16.10. Let $T: V \to W$ be a linear transformation in $\mathcal{L}(V, W)$. Then the
following statements are equivalent.

(a) T is one-to-one on V.

(b) T is invertible and its inverse $T^{-1}: T(V) \to V$ is linear.

(c) For all x in V, T(x) = O implies x = O. That is, the null space N(T) contains only
the zero element of V.

Proof. We shall prove that (a) implies (b), (b) implies (c), and (c) implies (a). First
assume (a) holds. Then T has an inverse (by Theorem 16.9), and we must show that $T^{-1}$
is linear. Take any two elements u and v in T(V). Then u = T(x) and v = T(y) for some
x and y in V. For any scalars a and b, we have

au + bv = aT(x) + bT(y) = T(ax + by),

since T is linear. Hence, applying $T^{-1}$, we have

$T^{-1}(au + bv) = ax + by = aT^{-1}(u) + bT^{-1}(v)$,

so $T^{-1}$ is linear. Therefore (a) implies (b).

Next assume that (b) holds. Take any x in V for which T(x) = O. Applying $T^{-1}$, we
find that $x = T^{-1}(O) = O$, since $T^{-1}$ is linear. Therefore, (b) implies (c).

Finally, assume (c) holds. Take any two elements u and v in V with T(u) = T(v). By
linearity, we have T(u - v) = T(u) - T(v) = O, so u - v = O. Therefore, T is one-to-one
on V, and the proof of the theorem is complete.

When V is finite-dimensional, the one-to-one property can be formulated in terms of 
independence and dimensionality, as indicated by the next theorem. 

THEOREM 16.11. Let $T: V \to W$ be a linear transformation in $\mathcal{L}(V, W)$ and assume that
V is finite-dimensional, say dim V = n. Then the following statements are equivalent.

(a) T is one-to-one on V.

(b) If $e_1, \dots, e_p$ are independent elements in V, then $T(e_1), \dots, T(e_p)$ are independent
elements in T(V).

(c) dim T(V) = n.

(d) If $\{e_1, \dots, e_n\}$ is a basis for V, then $\{T(e_1), \dots, T(e_n)\}$ is a basis for T(V).

Proof. We shall prove that (a) implies (b), (b) implies (c), (c) implies (d), and (d) implies
(a). Assume (a) holds. Let $e_1, \dots, e_p$ be independent elements of V and consider the
elements $T(e_1), \dots, T(e_p)$ in T(V). Suppose that

$$\sum_{i=1}^{p} c_i T(e_i) = O$$

for certain scalars $c_1, \dots, c_p$. By linearity, we obtain

$$T\Bigl(\sum_{i=1}^{p} c_i e_i\Bigr) = O, \quad \text{and hence} \quad \sum_{i=1}^{p} c_i e_i = O$$

since T is one-to-one. But $e_1, \dots, e_p$ are independent, so $c_1 = \dots = c_p = 0$. Therefore
(a) implies (b).

Now assume (b) holds. Let $\{e_1, \dots, e_n\}$ be a basis for V. By (b), the n elements
$T(e_1), \dots, T(e_n)$ in T(V) are independent. Therefore, $\dim T(V) \ge n$. But, by Theorem
16.3, we have $\dim T(V) \le n$. Therefore dim T(V) = n, so (b) implies (c).

Next, assume (c) holds and let $\{e_1, \dots, e_n\}$ be a basis for V. Take any element y in
T(V). Then y = T(x) for some x in V, so we have

$$x = \sum_{i=1}^{n} c_i e_i, \quad \text{and hence} \quad y = T(x) = \sum_{i=1}^{n} c_i T(e_i).$$

Therefore $\{T(e_1), \dots, T(e_n)\}$ spans T(V). But we are assuming dim T(V) = n, so
$\{T(e_1), \dots, T(e_n)\}$ is a basis for T(V). Therefore (c) implies (d).

Finally, assume (d) holds. We will prove that T(x) = O implies x = O. Let $\{e_1, \dots, e_n\}$
be a basis for V. If $x \in V$, we may write

$$x = \sum_{i=1}^{n} c_i e_i, \quad \text{and hence} \quad T(x) = \sum_{i=1}^{n} c_i T(e_i).$$

If T(x) = O, then $c_1 = \dots = c_n = 0$, since the elements $T(e_1), \dots, T(e_n)$ are independent.
Therefore x = O, so T is one-to-one on V. Thus, (d) implies (a) and the proof is complete.


16.8 Exercises 


1. Let V = {0, 1}. Describe all functions $T: V \to V$. There are four altogether. Label them as
$T_1, T_2, T_3, T_4$ and make a multiplication table showing the composition of each pair. Indicate
which functions are one-to-one on V and give their inverses.

2. Let V = {0, 1, 2}. Describe all functions $T: V \to V$ for which T(V) = V. There are six
altogether. Label them as $T_1, \dots, T_6$ and make a multiplication table showing the
composition of each pair. Indicate which functions are one-to-one on V, and give their inverses.

In each of Exercises 3 through 12, a function $T: V_2 \to V_2$ is defined by the formula given for
T(x, y), where (x, y) is an arbitrary point in $V_2$. In each case determine whether T is one-to-one
on $V_2$. If it is, describe its range $T(V_2)$; for each point (u, v) in $T(V_2)$, let $(x, y) = T^{-1}(u, v)$ and
give formulas for determining x and y in terms of u and v.

3. T(x, y) = (y, x).
4. T(x, y) = (x, -y).
5. T(x, y) = (x, 0).
6. T(x, y) = (x, x).
7. T(x, y) = $(x^2, y^2)$.
8. T(x, y) = $(e^x, e^y)$.
9. T(x, y) = (x, 1).
10. T(x, y) = (x + 1, y + 1).
11. T(x, y) = (x - y, x + y).
12. T(x, y) = (2x - y, x + y).


In each of Exercises 13 through 20, a function $T: V_3 \to V_3$ is defined by the formula given for
T(x, y, z), where (x, y, z) is an arbitrary point in $V_3$. In each case, determine whether T is
one-to-one on $V_3$. If it is, describe its range $T(V_3)$; for each point (u, v, w) in $T(V_3)$, let
$(x, y, z) = T^{-1}(u, v, w)$ and give formulas for determining x, y, and z in terms of u, v, and w.

13. T(x, y, z) = (z, y, x).
14. T(x, y, z) = (x, y, 0).
15. T(x, y, z) = (x, 2y, 3z).
16. T(x, y, z) = (x, y, x + y + z).
17. T(x, y, z) = (x + 1, y + 1, z - 1).
18. T(x, y, z) = (x + 1, y + 2, z + 3).
19. T(x, y, z) = (x, x + y, x + y + z).
20. T(x, y, z) = (x + y, y + z, x + z).

21. Let $T: V \to V$ be a function which maps V into itself. Powers are defined inductively by the
formulas $T^0 = I$, $T^n = T T^{n-1}$ for $n \ge 1$. Prove that the associative law for composition
implies the law of exponents: $T^m T^n = T^{m+n}$. If T is invertible, prove that $T^n$ is also invertible
and that $(T^n)^{-1} = (T^{-1})^n$.

In Exercises 22 through 25, S and T denote functions with domain V and values in V. In
general, $ST \ne TS$. If ST = TS, we say that S and T commute.

22. If S and T commute, prove that $(ST)^n = S^n T^n$ for all integers $n \ge 0$.

23. If S and T are invertible, prove that ST is also invertible and that $(ST)^{-1} = T^{-1} S^{-1}$. In other
words, the inverse of ST is the composition of inverses, taken in reverse order.

24. If S and T are invertible and commute, prove that their inverses also commute.

25. Let V be a linear space. If S and T commute, prove that

$(S + T)^2 = S^2 + 2ST + T^2$ and $(S + T)^3 = S^3 + 3S^2 T + 3ST^2 + T^3$.

Indicate how these formulas must be altered if $ST \ne TS$.

26. Let S and T be the linear transformations of $V_3$ into $V_3$ defined by the formulas S(x, y, z) =
(z, y, x) and T(x, y, z) = (x, x + y, x + y + z), where (x, y, z) is an arbitrary point of $V_3$.

(a) Determine the image of (x, y, z) under each of the following transformations: ST, TS,
ST - TS, $S^2$, $T^2$, $(ST)^2$, $(TS)^2$, $(ST - TS)^2$.

(b) Prove that S and T are one-to-one on $V_3$ and find the image of (u, v, w) under each of the
following transformations: $S^{-1}$, $T^{-1}$, $(ST)^{-1}$, $(TS)^{-1}$.

(c) Find the image of (x, y, z) under $(T - I)^n$ for each $n \ge 1$.

27. Let V be the linear space of all real polynomials p(x). Let D denote the differentiation operator
and let T denote the integration operator which maps each polynomial p onto the polynomial
q given by $q(x) = \int_0^x p(t)\, dt$. Prove that DT = I but that $TD \ne I$. Describe the null space
and range of TD.

28. Let V be the linear space of all real polynomials p(x). Let D denote the differentiation operator
and let T be the linear transformation that maps p(x) onto xp'(x).

(a) Let $p(x) = 2 + 3x - x^2 + 4x^3$ and determine the image of p under each of the following
transformations: D, T, DT, TD, DT - TD, $T^2 D^2 - D^2 T^2$.

(b) Determine those p in V for which T(p) = p.

(c) Determine those p in V for which (DT - 2D)(p) = O.

(d) Determine those p in V for which $(DT - TD)^n(p) = D^n(p)$.

29. Let V and D be as in Exercise 28 but let T be the linear transformation that maps p(x) onto
xp(x). Prove that DT - TD = I and that $DT^n - T^n D = nT^{n-1}$ for $n \ge 2$.

30. Let S and T be in $\mathcal{L}(V, V)$ and assume that ST - TS = I. Prove that $ST^n - T^n S = nT^{n-1}$
for all $n \ge 1$.

31. Let V be the linear space of all real polynomials p(x). Let R, S, T be the functions which map
an arbitrary polynomial $p(x) = c_0 + c_1 x + \dots + c_n x^n$ in V onto the polynomials r(x), s(x),
and t(x), respectively, where

$$r(x) = p(0), \qquad s(x) = \sum_{k=1}^{n} c_k x^{k-1}, \qquad t(x) = \sum_{k=0}^{n} c_k x^{k+1}.$$

(a) Let $p(x) = 2 + 3x - x^2 + x^3$ and determine the image of p under each of the following
transformations: R, S, T, ST, TS, $(TS)^2$, $T^2 S^2$, $S^2 T^2$, TRS, RST.

(b) Prove that R, S, and T are linear and determine the null space and range of each.

(c) Prove that T is one-to-one on V and determine its inverse.

(d) If $n \ge 1$, express $(TS)^n$ and $S^n T^n$ in terms of I and R.

32. Refer to Exercise 28 of Section 16.4. Determine whether T is one-to-one on V. If it is, describe 
its inverse. 

16.9 Linear transformations with prescribed values 

If V is finite-dimensional, we can always construct a linear transformation $T: V \to W$
with prescribed values at the basis elements of V, as described in the next theorem.

THEOREM 16.12. Let $e_1, \dots, e_n$ be a basis for an n-dimensional linear space V. Let
$u_1, \dots, u_n$ be n arbitrary elements in a linear space W. Then there is one and only one linear
transformation $T: V \to W$ such that

(16.7)    $T(e_k) = u_k$ for k = 1, 2, ..., n.

This T maps an arbitrary element x in V as follows:

(16.8)    If $x = \sum_{k=1}^{n} x_k e_k$, then $T(x) = \sum_{k=1}^{n} x_k u_k$.

Proof. Every x in V can be expressed uniquely as a linear combination of $e_1, \dots, e_n$,
the multipliers $x_1, \dots, x_n$ being the components of x relative to the ordered basis
$(e_1, \dots, e_n)$. If we define T by (16.8), it is a straightforward matter to verify that T is
linear. If $x = e_k$ for some k, then all components of x are 0 except the kth, which is 1, so
(16.8) gives $T(e_k) = u_k$, as required.

To prove that there is only one linear transformation satisfying (16.7), let T' be another
and compute T'(x). We find that

$$T'(x) = T'\Bigl(\sum_{k=1}^{n} x_k e_k\Bigr) = \sum_{k=1}^{n} x_k T'(e_k) = \sum_{k=1}^{n} x_k u_k = T(x).$$

Since T'(x) = T(x) for all x in V, we have T' = T, which completes the proof.

EXAMPLE. Determine the linear transformation $T: V_2 \to V_2$ which maps the basis elements
i = (1, 0) and j = (0, 1) as follows:

T(i) = i + j,    T(j) = 2i - j.

Solution. If $x = x_1 i + x_2 j$ is an arbitrary element of $V_2$, then T(x) is given by

$T(x) = x_1 T(i) + x_2 T(j) = x_1(i + j) + x_2(2i - j) = (x_1 + 2x_2)i + (x_1 - x_2)j$.
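
The construction in Theorem 16.12 can be sketched in a few lines of Python. The sketch assumes that elements of $V_2$ are stored as tuples of components relative to the unit coordinate vectors; the helper name linear_map_from_basis_images is hypothetical.

```python
def linear_map_from_basis_images(images):
    """Return T with T(e_k) = images[k], extended linearly as in Equation (16.8)."""
    def T(x):
        m = len(images[0])
        # T(x) = sum over k of x_k * u_k, computed one coordinate at a time
        return tuple(sum(x_k * u_k[i] for x_k, u_k in zip(x, images)) for i in range(m))
    return T

# Prescribed values from the example: T(i) = i + j = (1, 1), T(j) = 2i - j = (2, -1).
T = linear_map_from_basis_images([(1, 1), (2, -1)])

# T takes the prescribed values at the basis elements.
assert T((1, 0)) == (1, 1) and T((0, 1)) == (2, -1)

# For an arbitrary x = (x1, x2), T(x) = (x1 + 2*x2, x1 - x2).
x1, x2 = 3, 5
assert T((x1, x2)) == (x1 + 2 * x2, x1 - x2)
```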


16.10 Matrix representations of linear transformations 

Theorem 16.12 shows that a linear transformation $T: V \to W$ of a finite-dimensional
linear space V is completely determined by its action on a given set of basis elements
$e_1, \dots, e_n$. Now, suppose the space W is also finite-dimensional, say dim W = m, and let
$w_1, \dots, w_m$ be a basis for W. (The dimensions n and m may or may not be equal.) Since T
has values in W, each element $T(e_k)$ can be expressed uniquely as a linear combination of the
basis elements $w_1, \dots, w_m$, say

$$T(e_k) = \sum_{i=1}^{m} t_{ik} w_i,$$

where $t_{1k}, \dots, t_{mk}$ are the components of $T(e_k)$ relative to the ordered basis $(w_1, \dots, w_m)$.
We shall display the m-tuple $(t_{1k}, \dots, t_{mk})$ vertically, as follows:

(16.9)    $$\begin{bmatrix} t_{1k} \\ t_{2k} \\ \vdots \\ t_{mk} \end{bmatrix}$$

This array is called a column vector or a column matrix. We have such a column vector for
each of the n elements $T(e_1), \dots, T(e_n)$. We place them side by side and enclose them in
one pair of brackets to obtain the following rectangular array:

$$\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{bmatrix}$$

This array is called a matrix consisting of m rows and n columns. We call it an m by n matrix,
or an m × n matrix. The first row is the 1 × n matrix $(t_{11}, t_{12}, \dots, t_{1n})$. The m × 1
matrix displayed in (16.9) is the kth column. The scalars $t_{ik}$ are indexed so the first
subscript i indicates the row, and the second subscript k indicates the column in which $t_{ik}$
occurs. We call $t_{ik}$ the ik-entry or the ik-element of the matrix. The more compact notation

$(t_{ik})$, or $(t_{ik})_{i,k=1}^{m,n}$,

is also used to denote the matrix whose ik-entry is $t_{ik}$.

Thus, every linear transformation T of an n-dimensional space V into an m-dimensional
space W gives rise to an m × n matrix $(t_{ik})$ whose columns consist of the components of
$T(e_1), \dots, T(e_n)$ relative to the basis $(w_1, \dots, w_m)$. We call this the matrix representation
of T relative to the given choice of ordered bases $(e_1, \dots, e_n)$ for V and $(w_1, \dots, w_m)$ for
W. Once we know the matrix $(t_{ik})$, the components of any element T(x) relative to the
basis $(w_1, \dots, w_m)$ can be determined as described in the next theorem.


THEOREM 16.13. Let T be a linear transformation in $\mathcal{L}(V, W)$, where dim V = n and
dim W = m. Let $(e_1, \dots, e_n)$ and $(w_1, \dots, w_m)$ be ordered bases for V and W, respectively,
and let $(t_{ik})$ be the m × n matrix whose entries are determined by the equations

(16.10)    $T(e_k) = \sum_{i=1}^{m} t_{ik} w_i$ for k = 1, 2, ..., n.

Then an arbitrary element

(16.11)    $x = \sum_{k=1}^{n} x_k e_k$

in V with components $(x_1, \dots, x_n)$ relative to $(e_1, \dots, e_n)$ is mapped by T onto the element

(16.12)    $T(x) = \sum_{i=1}^{m} y_i w_i$

in W with components $(y_1, \dots, y_m)$ relative to $(w_1, \dots, w_m)$. The $y_i$ are related to the
components of x by the linear equations

(16.13)    $y_i = \sum_{k=1}^{n} t_{ik} x_k$ for i = 1, 2, ..., m.


Proof. Applying T to each member of (16.11) and using (16.10), we obtain

$$T(x) = \sum_{k=1}^{n} x_k T(e_k) = \sum_{k=1}^{n} x_k \sum_{i=1}^{m} t_{ik} w_i = \sum_{i=1}^{m} \Bigl(\sum_{k=1}^{n} t_{ik} x_k\Bigr) w_i = \sum_{i=1}^{m} y_i w_i,$$

where each $y_i$ is given by (16.13). This completes the proof.

Having chosen a pair of bases $(e_1, \dots, e_n)$ and $(w_1, \dots, w_m)$ for V and W, respectively,
every linear transformation $T: V \to W$ has a matrix representation $(t_{ik})$. Conversely, if
we start with any mn scalars arranged as a rectangular matrix $(t_{ik})$ and choose a pair of
ordered bases for V and W, then it is easy to prove that there is exactly one linear
transformation $T: V \to W$ having this matrix representation. We simply define T at the basis
elements of V by the equations in (16.10). Then, by Theorem 16.12, there is one and only
one linear transformation $T: V \to W$ with these prescribed values. The image T(x) of an
arbitrary point x in V is then given by Equations (16.12) and (16.13).

EXAMPLE 1. Construction of a linear transformation from a given matrix. Suppose we
start with the 2 × 3 matrix

$$\begin{bmatrix} 3 & 1 & -2 \\ 1 & 0 & 4 \end{bmatrix}.$$

Choose the usual bases of unit coordinate vectors for $V_3$ and $V_2$. Then the given matrix
represents a linear transformation $T: V_3 \to V_2$ which maps an arbitrary vector $(x_1, x_2, x_3)$
in $V_3$ onto the vector $(y_1, y_2)$ in $V_2$ according to the linear equations

$y_1 = 3x_1 + x_2 - 2x_3$,

$y_2 = x_1 + 0x_2 + 4x_3$.
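
As a small computational check, the Python sketch below applies Equation (16.13) to this example. The entries of the array t are read off from the two linear equations for $y_1$ and $y_2$; the function name apply_matrix and the sample vector are hypothetical choices for illustration.

```python
def apply_matrix(t, x):
    """Components y_i = sum over k of t[i][k] * x[k], as in Equation (16.13)."""
    return tuple(sum(t_ik * x_k for t_ik, x_k in zip(row, x)) for row in t)

# Coefficients of y1 = 3*x1 + x2 - 2*x3 and y2 = x1 + 0*x2 + 4*x3.
t = [[3, 1, -2],
     [1, 0, 4]]

# The columns of the matrix are the images of the unit coordinate vectors.
assert apply_matrix(t, (1, 0, 0)) == (3, 1)
assert apply_matrix(t, (0, 1, 0)) == (1, 0)
assert apply_matrix(t, (0, 0, 1)) == (-2, 4)

# Image of a sample vector (x1, x2, x3) = (1, 2, 3).
y = apply_matrix(t, (1, 2, 3))
assert y == (3 * 1 + 2 - 2 * 3, 1 + 0 * 2 + 4 * 3)
```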

EXAMPLE 2. Construction of a matrix representation of a given linear transformation.
Let V be the linear space of all real polynomials p(x) of degree $\le 3$. This space has
dimension 4, and we choose the basis $(1, x, x^2, x^3)$. Let D be the differentiation operator which
maps each polynomial p(x) in V onto its derivative p'(x). We can regard D as a linear
transformation of V into W, where W is the 3-dimensional space of all real polynomials
of degree $\le 2$. In W we choose the basis $(1, x, x^2)$. To find the matrix representation of D
relative to this choice of bases, we transform (differentiate) each basis element of V and
express it as a linear combination of the basis elements of W. Thus, we find that

$D(1) = 0 = 0 + 0x + 0x^2$,    $D(x) = 1 = 1 + 0x + 0x^2$,

$D(x^2) = 2x = 0 + 2x + 0x^2$,    $D(x^3) = 3x^2 = 0 + 0x + 3x^2$.

The coefficients of these polynomials determine the columns of the matrix representation of
D. Therefore, the required representation is given by the following 3 × 4 matrix:

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}.$$

To emphasize that the matrix representation depends not only on the basis elements but
also on their order, let us reverse the order of the basis elements in W and use, instead, the
ordered basis $(x^2, x, 1)$. Then the basis elements of V are transformed into the same
polynomials obtained above, but the components of these polynomials relative to the new
basis $(x^2, x, 1)$ appear in reversed order. Therefore, the matrix representation of D now
becomes

$$\begin{bmatrix} 0 & 0 & 0 & 3 \\ 0 & 0 & 2 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}.$$

Let us compute a third matrix representation for D, using the basis $(1, 1 + x, 1 + x + x^2,
1 + x + x^2 + x^3)$ for V, and the basis $(1, x, x^2)$ for W. The basis elements of V are
transformed as follows:

$D(1) = 0$,    $D(1 + x) = 1$,    $D(1 + x + x^2) = 1 + 2x$,

$D(1 + x + x^2 + x^3) = 1 + 2x + 3x^2$,

so the matrix representation in this case is

$$\begin{bmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 2 & 2 \\ 0 & 0 & 0 & 3 \end{bmatrix}.$$
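
The three representations of D can be reproduced mechanically. In the Python sketch below, a polynomial $c_0 + c_1 x + c_2 x^2 + c_3 x^3$ is stored as the list [c0, c1, c2, c3]; this storage convention and the helper names derivative and matrix_of_D are assumptions made for the illustration.

```python
def derivative(p):
    """Differentiate a polynomial given by its coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:]

def matrix_of_D(basis_V, dim_W):
    """Matrix of D relative to basis_V and the basis (1, x, ..., x^(dim_W - 1)) of W."""
    columns = []
    for e in basis_V:
        image = derivative(e)
        image = image + [0] * (dim_W - len(image))   # pad to dim_W components
        columns.append(image)
    # columns[k] holds the components of D(e_k); transpose to obtain the rows
    return [[col[i] for col in columns] for i in range(dim_W)]

standard = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # 1, x, x^2, x^3
third = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]      # 1, 1+x, 1+x+x^2, ...

first_matrix = matrix_of_D(standard, 3)
assert first_matrix == [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]

# Reversing the order of the basis of W to (x^2, x, 1) reverses the order of the rows.
assert first_matrix[::-1] == [[0, 0, 0, 3], [0, 0, 2, 0], [0, 1, 0, 0]]

third_matrix = matrix_of_D(third, 3)
assert third_matrix == [[0, 1, 1, 1], [0, 0, 2, 2], [0, 0, 0, 3]]
```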


16.11 Construction of a matrix representation in diagonal form 

Since it is possible to obtain different matrix representations of a given linear
transformation by different choices of bases, it is natural to try to choose the bases so that the resulting
matrix will have a particularly simple form. The next theorem shows that we can make
all the entries 0 except possibly along the diagonal starting from the upper left-hand corner
of the matrix. Along this diagonal there will be a string of ones followed by zeros, the
number of ones being equal to the rank of the transformation. A matrix $(t_{ik})$ with all
entries $t_{ik} = 0$ when $i \ne k$ is said to be a diagonal matrix.

THEOREM 16.14. Let V and W be finite-dimensional linear spaces, with dim V = n and
dim W = m. Assume $T \in \mathcal{L}(V, W)$ and let r = dim T(V) denote the rank of T. Then there
exists a basis $(e_1, \dots, e_n)$ for V and a basis $(w_1, \dots, w_m)$ for W such that

(16.14)    $T(e_i) = w_i$ for i = 1, 2, ..., r,

and

(16.15)    $T(e_i) = O$ for i = r + 1, ..., n.

Therefore, the matrix $(t_{ik})$ of T relative to these bases has all entries zero except for the r
diagonal entries

$t_{11} = t_{22} = \dots = t_{rr} = 1$.

Proof. First we construct a basis for W. Since T(V) is a subspace of W with dim T(V) =
r, the space T(V) has a basis of r elements in W, say $w_1, \dots, w_r$. By Theorem 15.7, these
elements form a subset of some basis for W. Therefore we can adjoin elements $w_{r+1}, \dots,
w_m$ so that

(16.16)    $(w_1, \dots, w_r, w_{r+1}, \dots, w_m)$

is a basis for W.

Now we construct a basis for V. Each of the first r elements $w_i$ in (16.16) is the image of at
least one element in V. Choose one such element in V and call it $e_i$. Then $T(e_i) = w_i$ for
i = 1, 2, ..., r, so (16.14) is satisfied. Now let k be the dimension of the null space N(T).
By Theorem 16.3 we have n = k + r. Since dim N(T) = k, the space N(T) has a basis
consisting of k elements in V which we designate as $e_{r+1}, \dots, e_{r+k}$. For each of these
elements, Equation (16.15) is satisfied. Therefore, to complete the proof, we must show
that the ordered set

(16.17)    $(e_1, \dots, e_r, e_{r+1}, \dots, e_{r+k})$

is a basis for V. Since dim V = n = r + k, we need only show that these elements are
independent. Suppose that some linear combination of them is zero, say

(16.18)    $\sum_{i=1}^{r+k} c_i e_i = O$.

Applying T and using Equations (16.14) and (16.15), we find that 


r+k r 

2 c iT(e { ) = 2 c i w i - O . 


1=1 


i= 1 
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But iv, , . . . , it’ are independent, and hence c x = 1 < . = c r = 0. Therefore, the first r 
terms in (16.18) are zero, so (16.18) reduces to 


»4-fc 

2 c i e i = ° • 
i=r + 1 

But e r+1 , . . . , e r+k are independent since they form a basis for N(T), and hence c r+1 = 

' 1 ' = c r+k = 0. Therefore, all the c, in (16.18) are zero, so the elements in (16.17) form a 
basis for V. This completes the proof. 

EXAMPLE. We refer to Example 2 of Section 16.10, where D is the differentiation operator
which maps the space V of polynomials of degree $\le 3$ into the space W of polynomials of
degree $\le 2$. In this example, the range T(V) = W, so T has rank 3. Applying the method
used to prove Theorem 16.14, we choose any basis for W, for example the basis $(1, x, x^2)$.
A set of polynomials in V which map onto these elements is given by $(x, \tfrac12 x^2, \tfrac13 x^3)$. We
extend this set to get a basis for V by adjoining the constant polynomial 1, which is a basis
for the null space of D. Therefore, if we use the basis $(x, \tfrac12 x^2, \tfrac13 x^3, 1)$ for V and the basis
$(1, x, x^2)$ for W, the corresponding matrix representation for D has the diagonal form

\[
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
\]

16.12 Exercises 

In all exercises involving the vector space $V_n$, the usual basis of unit coordinate vectors is to be
chosen unless another basis is specifically mentioned. In exercises concerned with the matrix of
a linear transformation $T: V \to W$ where V = W, we take the same basis in both V and W unless
another choice is indicated.

1. Determine the matrix of each of the following linear transformations of $V_n$ into $V_n$:

(a) the identity transformation, 

(b) the zero transformation, 

(c) multiplication by a fixed scalar c. 

2. Determine the matrix for each of the following projections.

(a) $T: V_3 \to V_2$, where $T(x_1, x_2, x_3) = (x_1, x_2)$.

(b) $T: V_3 \to V_2$, where $T(x_1, x_2, x_3) = (x_2, x_3)$.

(c) $T: V_5 \to V_3$, where $T(x_1, x_2, x_3, x_4, x_5) = (x_2, x_3, x_4)$.

3. A linear transformation $T: V_2 \to V_2$ maps the basis vectors i and j as follows:

\[ T(i) = i + j, \qquad T(j) = 2i - j. \]

(a) Compute $T(3i - 4j)$ and $T^2(3i - 4j)$ in terms of i and j.

(b) Determine the matrix of T and of $T^2$.

(c) Solve part (b) if the basis (i, j) is replaced by $(e_1, e_2)$, where $e_1 = i - j$, $e_2 = 3i + j$.

4. A linear transformation $T: V_2 \to V_2$ is defined as follows: Each vector (x, y) is reflected in
the y-axis and then doubled in length to yield T(x, y). Determine the matrix of T and of $T^2$.

5. Let $T: V_3 \to V_3$ be a linear transformation such that

\[ T(k) = 2i + 3j + 5k, \qquad T(j + k) = i, \qquad T(i + j + k) = j - k. \]

(a) Compute $T(i + 2j + 3k)$ and determine the nullity and rank of T.

(b) Determine the matrix of T.

6. For the linear transformation in Exercise 5, choose both bases to be $(e_1, e_2, e_3)$, where
$e_1 = (2, 3, 5)$, $e_2 = (1, 0, 0)$, $e_3 = (0, 1, -1)$, and determine the matrix of T relative to the new
bases.

7. A linear transformation $T: V_3 \to V_2$ maps the basis vectors as follows: T(i) = (0, 0),
T(j) = (1, 1), T(k) = (1, -1).

(a) Compute $T(4i - j + k)$ and determine the nullity and rank of T.

(b) Determine the matrix of T.

(c) Use the basis (i, j, k) in $V_3$ and the basis $(w_1, w_2)$ in $V_2$, where $w_1 = (1, 1)$, $w_2 = (1, 2)$.
Determine the matrix of T relative to these bases.

(d) Find bases $(e_1, e_2, e_3)$ for $V_3$ and $(w_1, w_2)$ for $V_2$ relative to which the matrix of T will be
in diagonal form.

8. A linear transformation $T: V_2 \to V_3$ maps the basis vectors as follows: T(i) = (1, 0, 1),
T(j) = (-1, 0, 1).

(a) Compute $T(2i - 3j)$ and determine the nullity and rank of T.

(b) Determine the matrix of T.

(c) Find bases $(e_1, e_2)$ for $V_2$ and $(w_1, w_2, w_3)$ for $V_3$ relative to which the matrix of T will be
in diagonal form.

9. Solve Exercise 8 if T(i) = (1, 0, 1) and T(j) = (1, 1, 1).

10. Let V and W be linear spaces, each with dimension 2 and each with basis $(e_1, e_2)$. Let $T: V \to W$
be a linear transformation such that

\[ T(e_1 + e_2) = 3e_1 + 9e_2, \qquad T(3e_1 + 2e_2) = 7e_1 + 23e_2. \]

(a) Compute $T(e_2 - e_1)$ and determine the nullity and rank of T.

(b) Determine the matrix of T relative to the given basis.

(c) Use the basis $(e_1, e_2)$ for V and find a new basis of the form $(e_1 + ae_2, 2e_1 + be_2)$ for W,
relative to which the matrix of T will be in diagonal form.

In the linear space of all real-valued functions, each of the following sets is independent and
spans a finite-dimensional subspace V. Use the given set as a basis for V and let $D: V \to V$ be
the differentiation operator. In each case, find the matrix of D and of $D^2$ relative to this choice
of basis.

11. $(\sin x, \cos x)$.

12. $(1, x, e^x)$.

13. $(1, 1 + x, 1 + x + e^x)$.

14. $(e^x, xe^x)$.

15. $(-\cos x, \sin x)$.

16. $(\sin x, \cos x, x \sin x, x \cos x)$.

17. $(e^x \sin x, e^x \cos x)$.

18. $(e^{2x} \sin 3x, e^{2x} \cos 3x)$.

19. Choose the basis $(1, x, x^2, x^3)$ in the linear space V of all real polynomials of degree $\le 3$.
Let D denote the differentiation operator and let $T: V \to V$ be the linear transformation
which maps p(x) onto xp'(x). Relative to the given basis, determine the matrix of each of the
following transformations: (a) T; (b) DT; (c) TD; (d) TD - DT; (e) $T^2$; (f) $T^2D^2 - D^2T^2$.

20. Refer to Exercise 19. Let W be the image of V under TD. Find bases for V and for W
relative to which the matrix of TD is in diagonal form.


16.13 Linear spaces of matrices 

We have seen how matrices arise in a natural way as representations of linear transformations.
Matrices can also be considered as objects existing in their own right, without
necessarily being connected to linear transformations. As such, they form another class of
mathematical objects on which algebraic operations can be defined. The connection
with linear transformations serves as motivation for these definitions, but this connection
will be ignored for the moment.

Let m and n be two positive integers, and let $I_{m,n}$ be the set of all pairs of integers (i, j)
such that $1 \le i \le m$, $1 \le j \le n$. Any function A whose domain is $I_{m,n}$ is called an $m \times n$
matrix. The function value A(i, j) is called the ij-entry or ij-element of the matrix and will
be denoted also by $a_{ij}$. It is customary to display all the function values in a rectangular
array consisting of m rows and n columns, as follows:

\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]

The elements $a_{ij}$ may be arbitrary objects of any kind. Usually they will be real or complex
numbers, but sometimes it is convenient to consider matrices whose elements are other
objects, for example, functions. We also denote matrices by the more compact notation

\[ A = (a_{ij})_{i,j=1}^{m,n} \qquad \text{or} \qquad A = (a_{ij}). \]

If m = n, the matrix is said to be a square matrix. A $1 \times n$ matrix is called a row matrix;
an $m \times 1$ matrix is called a column matrix.

Two functions are equal if and only if they have the same domain and take the same
function value at each element in the domain. Since matrices are functions, two matrices
$A = (a_{ij})$ and $B = (b_{ij})$ are equal if and only if they have the same number of rows, the
same number of columns, and equal entries $a_{ij} = b_{ij}$ for each pair (i, j).

Now we assume the entries are numbers (real or complex) and we define addition of
matrices and multiplication by scalars by the same method used for any real- or complex-valued
functions.

DEFINITION. If $A = (a_{ij})$ and $B = (b_{ij})$ are two $m \times n$ matrices and if c is any scalar,
we define matrices A + B and cA as follows:

\[ A + B = (a_{ij} + b_{ij}), \qquad cA = (ca_{ij}). \]

The sum is defined only when A and B have the same size.

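EXAMPLE. If

\[
A = \begin{bmatrix} 1 & 2 & -3 \\ -1 & 0 & 4 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} 5 & 0 & 1 \\ 1 & -2 & 3 \end{bmatrix},
\]

then we have

\[
A + B = \begin{bmatrix} 6 & 2 & -2 \\ 0 & -2 & 7 \end{bmatrix}, \qquad
2A = \begin{bmatrix} 2 & 4 & -6 \\ -2 & 0 & 8 \end{bmatrix}, \qquad
(-1)B = \begin{bmatrix} -5 & 0 & -1 \\ -1 & 2 & -3 \end{bmatrix}.
\]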
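The entrywise definitions of A + B and cA translate directly into code. The sketch below is
only an illustration under our own assumptions: a matrix is stored as a list of rows, and the
helper names `mat_add` and `mat_scale` are hypothetical. It uses the matrices A and B of the
example above.

```python
def mat_add(A, B):
    """Entrywise sum; defined only when A and B have the same size."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A + B is defined only for matrices of the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]

A = [[1, 2, -3], [-1, 0, 4]]
B = [[5, 0, 1], [1, -2, 3]]

print(mat_add(A, B))     # A + B
print(mat_scale(2, A))   # 2A
print(mat_scale(-1, B))  # (-1)B
```

Attempting `mat_add` on matrices of different sizes raises an error, which mirrors the remark
that the sum is defined only when A and B have the same size.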
We define the zero matrix O to be the $m \times n$ matrix all of whose elements are 0. With
these definitions, it is a straightforward exercise to verify that the collection of all $m \times n$
matrices is a linear space. We denote this linear space by $M_{m,n}$. If the entries are real
numbers, the space $M_{m,n}$ is a real linear space. If the entries are complex, $M_{m,n}$ is a complex
linear space. It is also easy to prove that this space has dimension mn. In fact, a basis for
$M_{m,n}$ consists of the mn matrices having one entry equal to 1 and all others equal to 0.
For example, the six matrices

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

form a basis for the set of all $2 \times 3$ matrices.
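To see concretely why these mn matrices span $M_{m,n}$, note that every matrix is the linear
combination of them whose coefficients are its own entries. The following sketch illustrates
this for a $2 \times 3$ matrix; the function names `unit_matrices` and `combine` are our own, and
matrices are again assumed to be stored as lists of rows.

```python
def unit_matrices(m, n):
    """Return the mn matrices with a single entry equal to 1 and all others 0."""
    basis = []
    for i in range(m):
        for j in range(n):
            E = [[0] * n for _ in range(m)]
            E[i][j] = 1
            basis.append(E)
    return basis

def combine(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k] of equally sized matrices."""
    m, n = len(mats[0]), len(mats[0][0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(m)]

basis = unit_matrices(2, 3)
A = [[1, 2, -3], [-1, 0, 4]]
coeffs = [a for row in A for a in row]   # entries of A, read row by row

print(len(basis))                     # number of basis matrices
print(combine(coeffs, basis) == A)    # A is recovered from its entries
```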

16.14 Isomorphism between linear transformations and matrices

We return now to the connection between matrices and linear transformations. Let V
and W be finite-dimensional linear spaces with $\dim V = n$ and $\dim W = m$. Choose a
basis $(e_1, \dots, e_n)$ for V and a basis $(w_1, \dots, w_m)$ for W. In this discussion, these bases are
kept fixed. Let $\mathcal{L}(V, W)$ denote the linear space of all linear transformations of V into
W. If $T \in \mathcal{L}(V, W)$, let m(T) denote the matrix of T relative to the given bases. We recall
that m(T) is defined as follows.

The image of each basis element $e_k$ is expressed as a linear combination of the basis
elements in W:

\[ T(e_k) = \sum_{i=1}^{m} t_{ik} w_i \quad \text{for } k = 1, 2, \dots, n. \tag{16.19} \]

The scalar multipliers $t_{ik}$ are the ik-entries of m(T). Thus, we have

\[ m(T) = (t_{ik})_{i,k=1}^{m,n}. \tag{16.20} \]

Equation (16.20) defines a new function m whose domain is $\mathcal{L}(V, W)$ and whose values
are matrices in $M_{m,n}$. Since every $m \times n$ matrix is the matrix m(T) for some T in $\mathcal{L}(V, W)$,
the range of m is $M_{m,n}$. The next theorem shows that the transformation
$m: \mathcal{L}(V, W) \to M_{m,n}$ is linear and one-to-one on $\mathcal{L}(V, W)$.


THEOREM 16.15. ISOMORPHISM THEOREM. For all S and T in $\mathcal{L}(V, W)$ and all scalars
c, we have

\[ m(S + T) = m(S) + m(T) \qquad \text{and} \qquad m(cT) = c\,m(T). \]

Moreover,

\[ m(S) = m(T) \quad \text{implies} \quad S = T, \]

so m is one-to-one on $\mathcal{L}(V, W)$.

Proof. The matrix m(T) is formed from the multipliers $t_{ik}$ in (16.19). Similarly, the
matrix m(S) is formed from the multipliers $s_{ik}$ in the equations

\[ S(e_k) = \sum_{i=1}^{m} s_{ik} w_i \quad \text{for } k = 1, 2, \dots, n. \tag{16.21} \]

Since we have

\[ (S + T)(e_k) = \sum_{i=1}^{m} (s_{ik} + t_{ik}) w_i \qquad \text{and} \qquad (cT)(e_k) = \sum_{i=1}^{m} (c\,t_{ik}) w_i, \]

we obtain $m(S + T) = (s_{ik} + t_{ik}) = m(S) + m(T)$ and $m(cT) = (c\,t_{ik}) = c\,m(T)$. This proves
that m is linear.

To prove that m is one-to-one, suppose that m(S) = m(T), where $m(S) = (s_{ik})$ and
$m(T) = (t_{ik})$. Equations (16.19) and (16.21) show that $S(e_k) = T(e_k)$ for each basis element $e_k$,
so S(x) = T(x) for all x in V, and hence S = T.

Note: The function m is called an isomorphism. For a given choice of bases, m
establishes a one-to-one correspondence between the set of linear transformations
$\mathcal{L}(V, W)$ and the set of $m \times n$ matrices $M_{m,n}$. The operations of addition and
multiplication by scalars are preserved under this correspondence. The linear spaces $\mathcal{L}(V, W)$
and $M_{m,n}$ are said to be isomorphic. Incidentally, Theorem 16.11 shows that the domain
of a one-to-one linear transformation has the same dimension as its range. Therefore,
$\dim \mathcal{L}(V, W) = \dim M_{m,n} = mn$.

If V = W and if we choose the same basis in both V and W, then the matrix m(I) which
corresponds to the identity transformation $I: V \to V$ is an $n \times n$ diagonal matrix with each
diagonal entry equal to 1 and all others equal to 0. This is called the identity or unit matrix
and is denoted by I or by $I_n$.
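The correspondence m can be made concrete when $V = V_n$ and $W = V_m$ carry the usual
bases of unit coordinate vectors: column k of m(T) then holds the components of $T(e_k)$,
as in (16.19). The sketch below is an illustration only. The function `matrix_of` and the two
sample transformations S and T are hypothetical choices of ours, not taken from the text.

```python
def matrix_of(T, n):
    """Matrix of a linear map T: V_n -> V_m relative to the unit coordinate bases.

    Column k holds the components of T(e_k), as in equation (16.19).
    """
    columns = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        columns.append(T(e_k))
    m = len(columns[0])
    return [[columns[k][i] for k in range(n)] for i in range(m)]

# Two hypothetical linear transformations of V_3 into V_2.
def S(x):
    return [x[0] + 2 * x[1], 3 * x[2]]

def T(x):
    return [x[1] - x[2], 4 * x[0]]

def S_plus_T(x):
    return [s + t for s, t in zip(S(x), T(x))]

def five_T(x):
    return [5 * t for t in T(x)]

def identity(x):
    return list(x)

mS, mT = matrix_of(S, 3), matrix_of(T, 3)
entrywise_sum = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(mS, mT)]

print(matrix_of(S_plus_T, 3) == entrywise_sum)                        # m(S + T) = m(S) + m(T)
print(matrix_of(five_T, 3) == [[5 * t for t in row] for row in mT])   # m(cT) = c m(T)
print(matrix_of(identity, 3))                                         # the identity matrix I_3
```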


16.15 Multiplication of matrices 

Some linear transformations can be multiplied by means of composition. Now we shall
define multiplication of matrices in such a way that the product of two matrices corresponds
to the composition of the linear transformations they represent.

We recall that if $T: U \to V$ and $S: V \to W$ are linear transformations, their composition
$ST: U \to W$ is a linear transformation given by

\[ ST(x) = S[T(x)] \quad \text{for all } x \text{ in } U. \]

Suppose that U, V, and W are finite-dimensional, say

\[ \dim U = n, \qquad \dim V = p, \qquad \dim W = m. \]

Choose bases for U, V, and W. Relative to these bases, the matrix m(S) is an $m \times p$
matrix, the matrix m(T) is a $p \times n$ matrix, and the matrix of ST is an $m \times n$ matrix. The
following definition of matrix multiplication will enable us to deduce the relation
m(ST) = m(S)m(T). This extends the isomorphism property to products.

DEFINITION. Let A be any $m \times p$ matrix, and let B be any $p \times n$ matrix, say

\[ A = (a_{ij})_{i,j=1}^{m,p} \qquad \text{and} \qquad B = (b_{ij})_{i,j=1}^{p,n}. \]

Then the product AB is defined to be the $m \times n$ matrix $C = (c_{ij})$ whose ij-entry is given by

\[ c_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}. \tag{16.22} \]

Note: The product AB is not defined unless the number of columns of A is equal to
the number of rows of B.

If we write $A_i$ for the ith row of A, and $B^j$ for the jth column of B, and think of these as
p-dimensional vectors, then the sum in (16.22) is simply the dot product $A_i \cdot B^j$. In other
words, the ij-entry of AB is the dot product of the ith row of A with the jth column of B:

\[ AB = (A_i \cdot B^j)_{i,j=1}^{m,n}. \]

Thus, matrix multiplication can be regarded as a generalization of the dot product.


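EXAMPLE 1. Let

\[
A = \begin{bmatrix} 3 & 1 & 2 \\ -1 & 1 & 0 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} 4 & 6 \\ 5 & -1 \\ 0 & 2 \end{bmatrix}.
\]

Since A is $2 \times 3$ and B is $3 \times 2$, the product AB is the $2 \times 2$ matrix

\[
AB = \begin{bmatrix} A_1 \cdot B^1 & A_1 \cdot B^2 \\ A_2 \cdot B^1 & A_2 \cdot B^2 \end{bmatrix}
= \begin{bmatrix} 17 & 21 \\ 1 & -7 \end{bmatrix}.
\]

The entries of AB are computed as follows:

\[
\begin{aligned}
A_1 \cdot B^1 &= 3 \cdot 4 + 1 \cdot 5 + 2 \cdot 0 = 17, &
A_1 \cdot B^2 &= 3 \cdot 6 + 1 \cdot (-1) + 2 \cdot 2 = 21, \\
A_2 \cdot B^1 &= (-1) \cdot 4 + 1 \cdot 5 + 0 \cdot 0 = 1, &
A_2 \cdot B^2 &= (-1) \cdot 6 + 1 \cdot (-1) + 0 \cdot 2 = -7.
\end{aligned}
\]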
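Formula (16.22) is short enough to program directly. The following sketch, with a helper name
`mat_mul` of our own choosing and matrices stored as lists of rows, forms each entry of AB as
the dot product of a row of A with a column of B and reproduces the product of Example 1.

```python
def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B, following (16.22)."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    n = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(n)] for row in A]

A = [[3, 1, 2], [-1, 1, 0]]
B = [[4, 6], [5, -1], [0, 2]]

print(mat_mul(A, B))   # [[17, 21], [1, -7]]
```

Note that `mat_mul(B, A)` is also defined here, but it is a $3 \times 3$ matrix, so AB and BA
need not even have the same size.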

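EXAMPLE 2. Let

\[
A = \begin{bmatrix} 2 & 1 & -3 \\ 1 & 2 & 4 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix}.
\]

Here A is $2 \times 3$ and B is $3 \times 1$, so AB is the $2 \times 1$ matrix given by

\[
AB = \begin{bmatrix} A_1 \cdot B^1 \\ A_2 \cdot B^1 \end{bmatrix}
= \begin{bmatrix} -9 \\ 8 \end{bmatrix},
\]

since $A_1 \cdot B^1 = 2 \cdot (-2) + 1 \cdot 1 + (-3) \cdot 2 = -9$ and
$A_2 \cdot B^1 = 1 \cdot (-2) + 2 \cdot 1 + 4 \cdot 2 = 8$.
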
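EXAMPLE 3. If A and B are both square matrices of the same size, then both AB and
BA are defined. For example, if

\[
A = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} 3 & 4 \\ 5 & 2 \end{bmatrix},
\]

we find that

\[
AB = \begin{bmatrix} 13 & 8 \\ 2 & -2 \end{bmatrix}, \qquad
BA = \begin{bmatrix} -1 & 10 \\ 3 & 12 \end{bmatrix}.
\]

This example shows that in general $AB \ne BA$. If AB = BA, we say A and B commute.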


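EXAMPLE 4. If $I_p$ is the $p \times p$ identity matrix, then $I_pA = A$ for every $p \times n$ matrix A,
and $BI_p = B$ for every $m \times p$ matrix B. For example,

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}.
\]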

Now we prove that the matrix of a composition ST is the product of the matrices m(S)
and m(T).

THEOREM 16.16. Let $T: U \to V$ and $S: V \to W$ be linear transformations, where U, V, W
are finite-dimensional linear spaces. Then, for a fixed choice of bases, the matrices of S, T,
and ST are related by the equation

\[ m(ST) = m(S)m(T). \]

Proof. Assume $\dim U = n$, $\dim V = p$, $\dim W = m$. Let $(u_1, \dots, u_n)$ be a basis for
U, $(v_1, \dots, v_p)$ a basis for V, and $(w_1, \dots, w_m)$ a basis for W. Relative to these bases, we
have

\[ m(S) = (s_{ij})_{i,j=1}^{m,p}, \quad \text{where } S(v_k) = \sum_{i=1}^{m} s_{ik} w_i \quad \text{for } k = 1, 2, \dots, p, \]

and

\[ m(T) = (t_{ij})_{i,j=1}^{p,n}, \quad \text{where } T(u_j) = \sum_{k=1}^{p} t_{kj} v_k \quad \text{for } j = 1, 2, \dots, n. \]

Therefore, we have

\[
ST(u_j) = S[T(u_j)] = \sum_{k=1}^{p} t_{kj} S(v_k)
= \sum_{k=1}^{p} t_{kj} \sum_{i=1}^{m} s_{ik} w_i
= \sum_{i=1}^{m} \Bigl( \sum_{k=1}^{p} s_{ik} t_{kj} \Bigr) w_i,
\]

so we find that

\[ m(ST) = \Bigl( \sum_{k=1}^{p} s_{ik} t_{kj} \Bigr)_{i,j=1}^{m,n} = m(S)m(T). \]
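Theorem 16.16 can be tested numerically on a small case. In the sketch below the spaces are
$V_2$, $V_3$, $V_2$ with unit coordinate bases, and the transformations T and S are hypothetical
examples of ours. The matrix of the composition, built column by column from the images of
the basis vectors, is compared with the product formed by (16.22).

```python
def matrix_of(F, n):
    """Matrix of a linear map F on V_n relative to unit coordinate bases."""
    columns = [F([1 if i == k else 0 for i in range(n)]) for k in range(n)]
    return [[col[i] for col in columns] for i in range(len(columns[0]))]

def mat_mul(A, B):
    """Matrix product defined by c_ij = sum_k a_ik * b_kj."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for row in A]

# Hypothetical linear maps T: V_2 -> V_3 and S: V_3 -> V_2.
def T(x):
    return [x[0] + x[1], 2 * x[0], x[1] - x[0]]

def S(y):
    return [y[0] - y[2], y[1] + 3 * y[2]]

def ST(x):
    return S(T(x))   # composition: first T, then S

m_S, m_T, m_ST = matrix_of(S, 3), matrix_of(T, 2), matrix_of(ST, 2)

print(m_ST)
print(mat_mul(m_S, m_T))
print(m_ST == mat_mul(m_S, m_T))
```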


We have already noted that matrix multiplication does not always satisfy the commutative
law. The next theorem shows that it does satisfy the associative and distributive
laws.

THEOREM 16.17. ASSOCIATIVE AND DISTRIBUTIVE LAWS FOR MATRIX MULTIPLICATION.
Given matrices A, B, C.

(a) If the products A(BC) and (AB)C are meaningful, we have

\[ A(BC) = (AB)C \qquad \text{(associative law)}. \]

(b) Assume A and B are of the same size. If AC and BC are meaningful, we have

\[ (A + B)C = AC + BC \qquad \text{(right distributive law)}, \]

whereas if CA and CB are meaningful, we have

\[ C(A + B) = CA + CB \qquad \text{(left distributive law)}. \]

Proof. These properties can be deduced directly from the definition of matrix multiplication,
but we prefer the following type of argument. Introduce finite-dimensional
linear spaces U, V, W, X and linear transformations $T: U \to V$, $S: V \to W$, $R: W \to X$
such that, for a fixed choice of bases, we have

\[ A = m(R), \qquad B = m(S), \qquad C = m(T). \]

By Theorem 16.16, we have m(RS) = AB and m(ST) = BC. From the associative law for
composition, we find that R(ST) = (RS)T. Applying Theorem 16.16 once more to this
equation, we obtain m(R)m(ST) = m(RS)m(T) or A(BC) = (AB)C, which proves (a). The
proof of (b) can be given by a similar type of argument.
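A numerical spot check is no substitute for the proof, but it shows the laws at work. The
sketch below uses integer matrices so that all arithmetic is exact. The helper names and the
matrices C and D are our own hypothetical choices, while A and B are taken from Example 1 of
this section.

```python
def mat_mul(A, B):
    """Matrix product defined by c_ij = sum_k a_ik * b_kj."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for row in A]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[3, 1, 2], [-1, 1, 0]]      # 2 x 3
B = [[4, 6], [5, -1], [0, 2]]    # 3 x 2
C = [[1, 2], [0, -1]]            # 2 x 2, hypothetical
D = [[1, 0, 2], [2, -1, 1]]      # 2 x 3, same size as A, hypothetical

# Associative law: A(BC) = (AB)C.
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))

# Right distributive law: (A + D)B = AB + DB.
print(mat_mul(mat_add(A, D), B) == mat_add(mat_mul(A, B), mat_mul(D, B)))

# Left distributive law: B(A + D) = BA + BD.
print(mat_mul(B, mat_add(A, D)) == mat_add(mat_mul(B, A), mat_mul(B, D)))
```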


DEFINITION. If A is a square matrix, we define integral powers of A inductively as
follows:

\[ A^0 = I, \qquad A^n = AA^{n-1} \quad \text{for } n \ge 1. \]
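The inductive definition of powers corresponds to a recursive function. The sketch below is
an illustration under our own naming (`identity`, `mat_mul`, `mat_pow`), applied to a sample
$2 \times 2$ matrix chosen by us.

```python
def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Matrix product defined by c_ij = sum_k a_ik * b_kj."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for row in A]

def mat_pow(A, n):
    """A^0 = I and A^n = A * A^(n-1) for n >= 1, for a square matrix A."""
    if n < 0:
        raise ValueError("only nonnegative integral powers are defined here")
    if n == 0:
        return identity(len(A))
    return mat_mul(A, mat_pow(A, n - 1))

A = [[1, 1], [0, 1]]   # sample matrix, chosen for illustration
for n in range(4):
    print(n, mat_pow(A, n))
```

For this particular matrix the printed powers suggest a simple pattern in the upper
right-hand entry, which can then be proved by induction.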


16.16 Exercises 


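1. If $A = \begin{bmatrix} 1 & -4 & 2 \\ -1 & 4 & -2 \end{bmatrix}$,
$B = \begin{bmatrix} 1 & 2 \\ -1 & 3 \\ 5 & -2 \end{bmatrix}$,
$C = \begin{bmatrix} 2 & 2 \\ 1 & -1 \\ 1 & -3 \end{bmatrix}$,
compute B + C, AB, BA, AC, CA, A(2B - 3C).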

2. Let $A = \begin{bmatrix} 0 & 1 \\ 0 & 2 \end{bmatrix}$. Find all $2 \times 2$ matrices B such that (a) AB = O; (b) BA = O.

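3. In each case find a, b, c, d to satisfy the given equation.

(a)
\[
\begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}
= \begin{bmatrix} 1 \\ 9 \\ 6 \\ 5 \end{bmatrix};
\]

(b)
\[
\begin{bmatrix} a & b & c & d \\ 1 & 4 & 9 & 2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 6 & 6 \\ 1 & 9 & 8 & 4 \end{bmatrix}.
\]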
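4. Calculate AB - BA in each case.

(a) $A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix}$,
$B = \begin{bmatrix} 4 & 1 & 1 \\ -4 & 2 & 0 \\ 1 & 2 & 1 \end{bmatrix}$;

(b) $A = \begin{bmatrix} 2 & 0 & 0 \\ 1 & 1 & 2 \\ -1 & 2 & 1 \end{bmatrix}$,
$B = \begin{bmatrix} 3 & 1 & -2 \\ 3 & -2 & 4 \\ -3 & 5 & 11 \end{bmatrix}$.
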


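6. Let $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. Verify that
$A^2 = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and compute $A^n$.

7. Let $A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$. Verify that
$A^2 = \begin{bmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{bmatrix}$ and compute $A^n$.

8. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. Verify that
$A^2 = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}$. Compute $A^3$ and $A^4$. Guess a general
formula for $A^n$ and prove it by induction.

9. Let $A = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}$. Prove that $A^2 = 2A - I$ and compute $A^{100}$.

10. Find all $2 \times 2$ matrices A such that $A^2 = O$.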

11. (a) Prove that a $2 \times 2$ matrix A commutes with every $2 \times 2$ matrix if and only if A commutes
with each of the four matrices

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\]

(b) Find all such matrices A.

12. The equation $A^2 = I$ is satisfied by each of the $2 \times 2$ matrices

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 0 \\ c & -1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & b \\ 0 & -1 \end{bmatrix},
\]

where b and c are arbitrary real numbers. Find all $2 \times 2$ matrices A such that $A^2 = I$.

13. If $A = \begin{bmatrix} 2 & -1 \\ -2 & 3 \end{bmatrix}$ and
$B = \begin{bmatrix} 7 & 6 \\ 9 & 8 \end{bmatrix}$, find $2 \times 2$ matrices C and D such that AC = B
and DA = B.

14. (a) Verify that the algebraic identities

\[ (A + B)^2 = A^2 + 2AB + B^2 \qquad \text{and} \qquad (A + B)(A - B) = A^2 - B^2 \]

do not hold for the $2 \times 2$ matrices $A = \begin{bmatrix} 1 & -1 \\ 0 & 2 \end{bmatrix}$ and
$B = \begin{bmatrix} 1 & 0 \\ 1 & 2 \end{bmatrix}$.

(b) Amend the right-hand members of these identities to obtain formulas valid for all square
matrices A and B.

(c) For which matrices A and B are the identities valid as stated in (a)?


16.17 Systems of linear equations

Let $A = (a_{ik})$ be a given $m \times n$ matrix of numbers, and let $c_1, \dots, c_m$ be m further
numbers. A set of m equations of the form

\[ \sum_{k=1}^{n} a_{ik} x_k = c_i \quad \text{for } i = 1, 2, \dots, m, \tag{16.23} \]

is called a system of m linear equations in n unknowns. Here $x_1, \dots, x_n$ are regarded as
unknown. A solution of the system is any n-tuple of numbers $(x_1, \dots, x_n)$ for which all the
equations are satisfied. The matrix A is called the coefficient-matrix of the system.

Linear systems can be studied with the help of linear transformations. Choose the usual
bases of unit coordinate vectors in $V_n$ and in $V_m$. The coefficient-matrix A determines a
linear transformation, $T: V_n \to V_m$, which maps an arbitrary vector $x = (x_1, \dots, x_n)$ in $V_n$
onto the vector $y = (y_1, \dots, y_m)$ in $V_m$ given by the m linear equations

\[ y_i = \sum_{k=1}^{n} a_{ik} x_k \quad \text{for } i = 1, 2, \dots, m. \]

Let $c = (c_1, \dots, c_m)$ be the vector in $V_m$ whose components are the numbers appearing in
system (16.23). This system can be written more simply as

\[ T(x) = c. \]

The system has a solution if and only if c is in the range of T. If exactly one x in $V_n$ maps
onto c, the system has exactly one solution. If more than one x maps onto c, the system
has more than one solution.

EXAMPLE 1. A system with no solution. The system x + y = 1, x + y = 2 has no solution.
The sum of two numbers cannot be both 1 and 2.

EXAMPLE 2. A system with exactly one solution. The system x + y = 1, x - y = 0 has
exactly one solution: $(x, y) = (\tfrac12, \tfrac12)$.

EXAMPLE 3. A system with more than one solution. The system x + y = 1, consisting
of one equation in two unknowns, has more than one solution. Any two numbers whose
sum is 1 give a solution.

With each linear system (16.23) we can associate another system

\[ \sum_{k=1}^{n} a_{ik} x_k = 0 \quad \text{for } i = 1, 2, \dots, m, \]

obtained by replacing each $c_i$ in (16.23) by 0. This is called the homogeneous system
corresponding to (16.23). If $c \ne O$, system (16.23) is called a nonhomogeneous system. A vector
x in $V_n$ will satisfy the homogeneous system if and only if

\[ T(x) = O, \]

where T is the linear transformation determined by the coefficient-matrix. The homogeneous
system always has one solution, namely x = O, but it may have others. The set of
solutions of the homogeneous system is the null space of T. The next theorem describes the
relation between solutions of the homogeneous system and those of the nonhomogeneous
system.


THEOREM 16.18. Assume the nonhomogeneous system (16.23) has a solution, say b.

(a) If a vector x is a solution of the nonhomogeneous system, then the vector v = x - b
is a solution of the corresponding homogeneous system.

(b) If a vector v is a solution of the homogeneous system, then the vector x = v + b is a
solution of the nonhomogeneous system.

Proof. Let $T: V_n \to V_m$ be the linear transformation determined by the coefficient-matrix,
as described above. Since b is a solution of the nonhomogeneous system we have
T(b) = c. Let x and v be two vectors in $V_n$ such that v = x - b. Then we have

\[ T(v) = T(x - b) = T(x) - T(b) = T(x) - c. \]

Therefore T(x) = c if and only if T(v) = O. This proves both (a) and (b).

This theorem shows that the problem of finding all solutions of a nonhomogeneous
system splits naturally into two parts: (1) Finding all solutions v of the homogeneous
system, that is, determining the null space of T; and (2) finding one particular solution b of
the nonhomogeneous system. By adding b to each vector v in the null space of T, we thereby
obtain all solutions x = v + b of the nonhomogeneous system.

Let k denote the dimension of N(T) (the nullity of T). If we can find k independent
solutions $v_1, \dots, v_k$ of the homogeneous system, they will form a basis for N(T), and we
can obtain every v in N(T) by forming all possible linear combinations

\[ v = t_1 v_1 + \cdots + t_k v_k, \]

where $t_1, \dots, t_k$ are arbitrary scalars. This linear combination is called the general solution
of the homogeneous system. If b is one particular solution of the nonhomogeneous system,
then all solutions x are given by

\[ x = b + t_1 v_1 + \cdots + t_k v_k. \]

This linear combination is called the general solution of the nonhomogeneous system.
Theorem 16.18 can now be restated as follows.

THEOREM 16.19. Let $T: V_n \to V_m$ be the linear transformation such that T(x) = y, where
$x = (x_1, \dots, x_n)$, $y = (y_1, \dots, y_m)$ and

\[ y_i = \sum_{k=1}^{n} a_{ik} x_k \quad \text{for } i = 1, 2, \dots, m. \]

Let k denote the nullity of T. If $v_1, \dots, v_k$ are k independent solutions of the homogeneous
system T(x) = O, and if b is one particular solution of the nonhomogeneous system T(x) = c,
then the general solution of the nonhomogeneous system is

\[ x = b + t_1 v_1 + \cdots + t_k v_k, \]

where $t_1, \dots, t_k$ are arbitrary scalars.

This theorem does not tell us how to decide if a nonhomogeneous system has a particular
solution b, nor does it tell us how to determine solutions $v_1, \dots, v_k$ of the homogeneous
system. It does tell us what to expect when the nonhomogeneous system has a solution.
The following example, although very simple, illustrates the theorem.

EXAMPLE. The system x + y = 2 has for its associated homogeneous system the equation
x + y = 0. Therefore, the null space consists of all vectors in $V_2$ of the form (t, -t),
where t is arbitrary. Since (t, -t) = t(1, -1), this is a one-dimensional subspace of $V_2$
with basis (1, -1). A particular solution of the nonhomogeneous system is (0, 2). Therefore
the general solution of the nonhomogeneous system is given by

\[ (x, y) = (0, 2) + t(1, -1) \qquad \text{or} \qquad x = t, \quad y = 2 - t, \]

where t is arbitrary.
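The structure "particular solution plus null space" in this example can be checked for a few
sample values of t. The sketch below is only an illustration. The function T encodes the
single equation x + y, and the sample values of t are arbitrary choices of ours.

```python
from fractions import Fraction

def T(v):
    """Linear transformation of V_2 into V_1 determined by the coefficient matrix [1 1]."""
    return v[0] + v[1]

b = (0, 2)     # particular solution of x + y = 2
v = (1, -1)    # basis vector for the null space of T

samples = [Fraction(-3), Fraction(0), Fraction(1, 2), Fraction(7)]
solutions = [(b[0] + t * v[0], b[1] + t * v[1]) for t in samples]

print(T(v))                            # v solves the homogeneous equation
print([T(x) for x in solutions])       # every b + t v solves x + y = 2
```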

16.18 Computation techniques 

We turn now to the problem of actually computing the solutions of a nonhomogeneous 
linear system. Although many methods have been developed for attacking this problem, 
all of them require considerable computation if the system is large. For example, to solve 
a system of ten equations in as many unknowns can require several hours of hand com- 
putation, even with the aid of a desk calculator. 

We shall discuss a widely-used method, known as the Gauss-Jordan elimination method, 
which is relatively simple and can be easily programmed for high-speed electronic computing 
machines. The method consists of applying three basic types of operations on the equations 
of a linear system: 

(1) Interchanging two equations;

(2) Multiplying all the terms of an equation by a nonzero scalar;

(3) Adding to one equation a multiple of another.

Each time we perform one of these operations on the system we obtain a new system having 
exactly the same solutions. Two such systems are called equivalent. By performing these 
operations over and over again in a systematic fashion we finally arrive at an equivalent 
system which can be solved by inspection. 
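The three operations can be written as small functions acting on the rows of an augmented
matrix, that is, on the coefficients of each equation followed by its right-hand member. The
sketch below is an illustration only. The function names and the small sample system
x + y = 3, x - y = 1 are our own assumptions, and the check at the end confirms that the
solution (2, 1) of that system survives each operation.

```python
from fractions import Fraction

def interchange(M, i, j):
    """Operation (1): interchange equations (rows) i and j."""
    N = [row[:] for row in M]
    N[i], N[j] = N[j], N[i]
    return N

def multiply_row(M, i, c):
    """Operation (2): multiply all terms of equation i by a nonzero scalar c."""
    if c == 0:
        raise ValueError("the scalar must be nonzero")
    N = [row[:] for row in M]
    N[i] = [c * a for a in N[i]]
    return N

def add_multiple(M, target, source, c):
    """Operation (3): add c times equation `source` to equation `target`."""
    N = [row[:] for row in M]
    N[target] = [a + c * b for a, b in zip(N[target], N[source])]
    return N

def satisfies(M, x):
    """True if x solves every equation encoded by the augmented matrix M."""
    return all(sum(a * xi for a, xi in zip(row[:-1], x)) == row[-1] for row in M)

# Hypothetical system: x + y = 3, x - y = 1.
M0 = [[1, 1, 3], [1, -1, 1]]
M1 = add_multiple(M0, 1, 0, -1)              # eliminate x from the second equation
M2 = multiply_row(M1, 1, Fraction(-1, 2))    # make the coefficient of y equal to 1
M3 = add_multiple(M2, 0, 1, -1)              # eliminate y from the first equation

print(M3)   # the system x = 2, y = 1, solvable by inspection
print([satisfies(M, (2, 1)) for M in (M0, M1, M2, M3)])
```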

We shall illustrate the method with some specific examples. It will then be clear how the 
method is to be applied in general. 

EXAMPLE 1. A system with a unique solution. Consider the system

\[
\begin{aligned}
2x - 5y + 4z &= -3 \\
x - 2y + z &= 5 \\
x - 4y + 6z &= 10.
\end{aligned}
\]

This particular system has a unique solution, x = 124, y = 75, z = 31, which we shall
obtain by the Gauss-Jordan elimination process. To save labor we do not bother to copy
the letters x, y, z and the equals sign over and over again, but work instead with the
augmented matrix

\[
\left[\begin{array}{rrr|r} 2 & -5 & 4 & -3 \\ 1 & -2 & 1 & 5 \\ 1 & -4 & 6 & 10 \end{array}\right]
\tag{16.24}
\]

obtained by adjoining the right-hand members of the system to the coefficient matrix. The
three basic types of operations mentioned above are performed on the rows of the augmented
matrix and are called row operations. At any stage of the process we can put the letters
x, y, z back again and insert equals signs along the vertical line to obtain equations. Our
ultimate goal is to arrive at the augmented matrix

\[
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 124 \\ 0 & 1 & 0 & 75 \\ 0 & 0 & 1 & 31 \end{array}\right]
\tag{16.25}
\]

after a succession of row operations. The corresponding system of equations is x = 124,
y = 75, z = 31, which gives the desired solution.

The first step is to obtain a 1 in the upper left-hand corner of the matrix. We can do this
by interchanging the first row of the given matrix (16.24) with either the second or third
row. Or, we can multiply the first row by $\tfrac12$. Interchanging the first and second rows, we get

\[
\left[\begin{array}{rrr|r} 1 & -2 & 1 & 5 \\ 2 & -5 & 4 & -3 \\ 1 & -4 & 6 & 10 \end{array}\right].
\]

The next step is to make all the remaining entries in the first column equal to zero, leaving
the first row intact. To do this we multiply the first row by -2 and add the result to the
second row. Then we multiply the first row by -1 and add the result to the third row.
After these two operations, we obtain

\[
\left[\begin{array}{rrr|r} 1 & -2 & 1 & 5 \\ 0 & -1 & 2 & -13 \\ 0 & -2 & 5 & 5 \end{array}\right].
\tag{16.26}
\]

Now we repeat the process on the smaller matrix
$\left[\begin{array}{rr|r} -1 & 2 & -13 \\ -2 & 5 & 5 \end{array}\right]$ which appears
adjacent to the two zeros. We can obtain a 1 in its upper left-hand corner by multiplying
the second row of (16.26) by -1. This gives us the matrix

\[
\left[\begin{array}{rrr|r} 1 & -2 & 1 & 5 \\ 0 & 1 & -2 & 13 \\ 0 & -2 & 5 & 5 \end{array}\right].
\]

Multiplying the second row by 2 and adding the result to the third, we get

\[
\left[\begin{array}{rrr|r} 1 & -2 & 1 & 5 \\ 0 & 1 & -2 & 13 \\ 0 & 0 & 1 & 31 \end{array}\right].
\tag{16.27}
\]

At this stage, the corresponding system of equations is given by

\[
\begin{aligned}
x - 2y + z &= 5 \\
y - 2z &= 13 \\
z &= 31.
\end{aligned}
\]

These equations can be solved in succession, starting with the third one and working
backwards, to give us

\[ z = 31, \qquad y = 13 + 2z = 13 + 62 = 75, \qquad x = 5 + 2y - z = 5 + 150 - 31 = 124. \]

Or, we can continue the Gauss-Jordan process by making all the entries zero above the
diagonal elements in the second and third columns. Multiplying the second row of (16.27)
by 2 and adding the result to the first row, we obtain

\[
\left[\begin{array}{rrr|r} 1 & 0 & -3 & 31 \\ 0 & 1 & -2 & 13 \\ 0 & 0 & 1 & 31 \end{array}\right].
\]

Finally, we multiply the third row by 3 and add the result to the first row, and then multiply
the third row by 2 and add the result to the second row to get the matrix in (16.25).
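The whole process of Example 1 can be programmed. The sketch below is one possible
implementation, not the only one. It works with exact rational arithmetic, the function name
`gauss_jordan` is ours, and it chooses its pivots in a slightly different order from the hand
computation above (it scales the first row by $\tfrac12$ instead of interchanging rows). Since
every step is one of the three row operations, the final system is equivalent to the original.

```python
from fractions import Fraction

def gauss_jordan(augmented):
    """Reduce an augmented matrix by row operations until each pivot column
    contains a single 1 and zeros elsewhere."""
    M = [[Fraction(x) for x in row] for row in augmented]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols - 1):              # the last column holds the right-hand members
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]        # operation (1)
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]        # operation (2)
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]  # operation (3)
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

system = [[2, -5, 4, -3],
          [1, -2, 1, 5],
          [1, -4, 6, 10]]

reduced = gauss_jordan(system)
for row in reduced:
    print([str(x) for x in row])
```

The last column of the reduced matrix then displays the solution x = 124, y = 75, z = 31.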

EXAMPLE 2. A system with more than one solution. Consider the following system of 3
equations in 5 unknowns:

\[
\begin{aligned}
2x - 5y + 4z + u - v &= -3 \\
x - 2y + z - u + v &= 5 \\
x - 4y + 6z + 2u - v &= 10.
\end{aligned}
\tag{16.28}
\]

The corresponding augmented matrix is

\[
\left[\begin{array}{rrrrr|r}
2 & -5 & 4 & 1 & -1 & -3 \\
1 & -2 & 1 & -1 & 1 & 5 \\
1 & -4 & 6 & 2 & -1 & 10
\end{array}\right].
\]

The coefficients of x, y, z and the right-hand members are the same as those in Example 1.
If we perform the same row operations used in Example 1, we finally arrive at the augmented
matrix

\[
\left[\begin{array}{rrrrr|r}
1 & 0 & 0 & -16 & 19 & 124 \\
0 & 1 & 0 & -9 & 11 & 75 \\
0 & 0 & 1 & -3 & 4 & 31
\end{array}\right].
\]

The corresponding system of equations can be solved for x, y, and z in terms of u and v,
giving us

\[
\begin{aligned}
x &= 124 + 16u - 19v \\
y &= 75 + 9u - 11v \\
z &= 31 + 3u - 4v.
\end{aligned}
\]

If we let $u = t_1$ and $v = t_2$, where $t_1$ and $t_2$ are arbitrary real numbers, and determine
x, y, z by these equations, the vector (x, y, z, u, v) in $V_5$ given by

\[ (x, y, z, u, v) = (124 + 16t_1 - 19t_2,\; 75 + 9t_1 - 11t_2,\; 31 + 3t_1 - 4t_2,\; t_1,\; t_2) \]

is a solution. By separating the parts involving $t_1$ and $t_2$, we can rewrite this as follows:

\[ (x, y, z, u, v) = (124, 75, 31, 0, 0) + t_1(16, 9, 3, 1, 0) + t_2(-19, -11, -4, 0, 1). \]

This equation gives the general solution of the system. The vector (124, 75, 31, 0, 0) is a
particular solution of the nonhomogeneous system (16.28). The two vectors (16, 9, 3, 1, 0)
and (-19, -11, -4, 0, 1) are solutions of the corresponding homogeneous system. Since
they are independent, they form a basis for the space of all solutions of the homogeneous
system.
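The claims made about the three vectors in this example are easy to confirm by substitution.
The sketch below applies the coefficient matrix of (16.28) to each of them. The helper name
`apply` and the sample parameter pairs $(t_1, t_2)$ are our own choices.

```python
A = [[2, -5, 4, 1, -1],
     [1, -2, 1, -1, 1],
     [1, -4, 6, 2, -1]]
c = [-3, 5, 10]

b = [124, 75, 31, 0, 0]       # particular solution
v1 = [16, 9, 3, 1, 0]         # solutions of the homogeneous system
v2 = [-19, -11, -4, 0, 1]

def apply(A, x):
    """Left-hand members of the system: the vector with components sum_k a_ik x_k."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

print(apply(A, b))    # equals c
print(apply(A, v1))   # zero vector
print(apply(A, v2))   # zero vector

# The general solution b + t1 v1 + t2 v2 for a few sample parameter values.
for t1, t2 in [(1, 0), (0, 1), (2, -3), (-5, 7)]:
    x = [bi + t1 * p + t2 * q for bi, p, q in zip(b, v1, v2)]
    print((t1, t2), apply(A, x) == c)
```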


example 3. A system with no solution. Consider the system 

2x — 5j + 4z = - 3 
(16.29) x-2y + z= 5 

x — 4y + 5z = 10. 

This system is almost identical to that of Example 1 except that the coefficient of z in the 
third equation has been changed from 6 to 5. The corresponding augmented matrix is 




\[
\left[\begin{array}{ccc|c}
2 & -5 & 4 & -3 \\
1 & -2 & 1 & 5 \\
1 & -4 & 5 & 10
\end{array}\right].
\]


Applying the same row operations used in Example 1 to transform (16.24) into (16.27), we 
arrive at the augmented matrix

\[
\left[\begin{array}{ccc|c}
1 & -2 & 1 & 5 \\
0 & 1 & -2 & 13 \\
0 & 0 & 0 & 31
\end{array}\right].
\tag{16.30}
\]

When the bottom row is expressed as an equation, it states that 0 = 31. Therefore the
original system has no solution since the two systems (16.29) and (16.30) are equivalent.

In each of the foregoing examples, the number of equations did not exceed the number 
of unknowns. If there are more equations than unknowns, the Gauss-Jordan process is 
still applicable. For example, suppose we consider the system of Example 1, which has the 
solution x = 124, y = 75, z = 31. If we adjoin a new equation to this system which is also 
satisfied by the same triple, for example, the equation 2x - 3y + z = 54, then the elimination
process leads to the augmented matrix

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 124 \\
0 & 1 & 0 & 75 \\
0 & 0 & 1 & 31 \\
0 & 0 & 0 & 0
\end{array}\right]
\]


with a row of zeros along the bottom. But if we adjoin a new equation which is not satisfied 
by the triple (124, 75, 31), for example the equation x + y + z = 1, then the elimination 
process leads to an augmented matrix of the form 



\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 124 \\
0 & 1 & 0 & 75 \\
0 & 0 & 1 & 31 \\
0 & 0 & 0 & a
\end{array}\right],
\]

where $a \neq 0$. The last row now gives a contradictory equation 0 = a, which shows that
the system has no solution. 
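
The row operations used in the foregoing examples can be mechanized. The following is a minimal sketch under stated assumptions, not the text's own procedure: the function names `gauss_jordan` and `is_inconsistent` are illustrative, the pivot is simply the first nonzero entry found in each column, and exact rational arithmetic from Python's `fractions` module stands in for hand computation.

```python
from fractions import Fraction


def gauss_jordan(augmented):
    """Reduce an augmented matrix [A | b] by Gauss-Jordan elimination.

    Works with exact fractions and returns a new list of rows.
    """
    rows = [[Fraction(entry) for entry in row] for row in augmented]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols - 1):  # the last column holds the right members
        # Find a row at or below pivot_row with a nonzero entry in this column.
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue
        swap = candidates[0]
        rows[pivot_row], rows[swap] = rows[swap], rows[pivot_row]
        # Scale the pivot row so that the pivot equals 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [entry / pivot for entry in rows[pivot_row]]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * p for a, p in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows


def is_inconsistent(reduced):
    """True if some row states 0 = a with a nonzero."""
    return any(all(e == 0 for e in row[:-1]) and row[-1] != 0 for row in reduced)


example_1 = gauss_jordan([[2, -5, 4, -3], [1, -2, 1, 5], [1, -4, 6, 10]])
example_3 = gauss_jordan([[2, -5, 4, -3], [1, -2, 1, 5], [1, -4, 5, 10]])
assert not is_inconsistent(example_1)
assert is_inconsistent(example_3)  # the bottom row states 0 = 31
```

The same function accepts more equations than unknowns: adjoining the row for 2x - 3y + z = 54 to Example 1 produces a final row of zeros, while adjoining the row for x + y + z = 1 produces a contradictory final row.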

16.19 Inverses of square matrices 

Let $A = (a_{ij})$ be a square $n \times n$ matrix. If there is another $n \times n$ matrix B such that
BA = I, where I is the $n \times n$ identity matrix, then A is called nonsingular and B is called a
left inverse of A.

Choose the usual basis of unit coordinate vectors in $V_n$ and let $T: V_n \to V_n$ be the linear
transformation with matrix m(T) = A. Then we have the following.

THEOREM 16.20. The matrix A is nonsingular if and only if T is invertible. If BA = I
then $B = m(T^{-1})$.

Proof. Assume that A is nonsingular and that BA = I. We shall prove that T(x) = 0
implies x = 0. Given x such that T(x) = 0, let X be the $n \times 1$ column matrix formed from
the components of x. Since T(x) = 0, the matrix product AX is an $n \times 1$ column matrix
consisting of zeros, so B(AX) is also a column matrix of zeros. But B(AX) = (BA)X =
IX = X, so every component of x is 0. Therefore, T is invertible, and the equation $TT^{-1} = I$
implies that $m(T)m(T^{-1}) = I$ or $Am(T^{-1}) = I$. Multiplying on the left by B, we find
$m(T^{-1}) = B$. Conversely, if T is invertible, then $T^{-1}T$ is the identity transformation so
$m(T^{-1})m(T)$ is the identity matrix. Therefore A is nonsingular and $m(T^{-1})A = I$.

All the properties of invertible linear transformations have their counterparts for
nonsingular matrices. In particular, left inverses (if they exist) are unique, and every left
inverse is also a right inverse. In other words, if A is nonsingular and BA = I, then
AB = I. We call B the inverse of A and denote it by $A^{-1}$. The inverse $A^{-1}$ is also nonsingular
and its inverse is A.

Now we show that the problem of actually determining the entries of the inverse of a 
nonsingular matrix is equivalent to solving n separate nonhomogeneous linear systems. 

Let $A = (a_{ij})$ be nonsingular and let $A^{-1} = (b_{ij})$ be its inverse. The entries of A and
$A^{-1}$ are related by the $n^2$ equations

\[
\sum_{k=1}^{n} a_{ik} b_{kj} = \delta_{ij},
\tag{16.31}
\]

where $\delta_{ij} = 1$ if i = j, and $\delta_{ij} = 0$ if $i \neq j$. For each fixed choice of j, we can regard this
as a nonhomogeneous system of n linear equations in n unknowns $b_{1j}, b_{2j}, \ldots, b_{nj}$. Since
A is nonsingular, each of these systems has a unique solution, the jth column of B. All
these systems have the same coefficient-matrix A and differ only in their right members.
For example, if A is a $3 \times 3$ matrix, there are 9 equations in (16.31) which can be expressed
as 3 separate linear systems having the following augmented matrices:


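\[
\left[\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & 1 \\
a_{21} & a_{22} & a_{23} & 0 \\
a_{31} & a_{32} & a_{33} & 0
\end{array}\right],
\quad
\left[\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & 0 \\
a_{21} & a_{22} & a_{23} & 1 \\
a_{31} & a_{32} & a_{33} & 0
\end{array}\right],
\quad
\left[\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & 0 \\
a_{21} & a_{22} & a_{23} & 0 \\
a_{31} & a_{32} & a_{33} & 1
\end{array}\right].
\]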


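If we apply the Gauss-Jordan process, we arrive at the respective augmented matrices

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & b_{11} \\
0 & 1 & 0 & b_{21} \\
0 & 0 & 1 & b_{31}
\end{array}\right],
\quad
\left[\begin{array}{ccc|c}
1 & 0 & 0 & b_{12} \\
0 & 1 & 0 & b_{22} \\
0 & 0 & 1 & b_{32}
\end{array}\right],
\quad
\left[\begin{array}{ccc|c}
1 & 0 & 0 & b_{13} \\
0 & 1 & 0 & b_{23} \\
0 & 0 & 1 & b_{33}
\end{array}\right].
\]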


In actual practice we exploit the fact that all three systems have the same coefficient-matrix
and solve all three systems at once by working with the enlarged matrix

\[
\left[\begin{array}{ccc|ccc}
a_{11} & a_{12} & a_{13} & 1 & 0 & 0 \\
a_{21} & a_{22} & a_{23} & 0 & 1 & 0 \\
a_{31} & a_{32} & a_{33} & 0 & 0 & 1
\end{array}\right].
\]

The elimination process then leads to

\[
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & b_{11} & b_{12} & b_{13} \\
0 & 1 & 0 & b_{21} & b_{22} & b_{23} \\
0 & 0 & 1 & b_{31} & b_{32} & b_{33}
\end{array}\right].
\]

The matrix on the right of the vertical line is the required inverse. The matrix on the left
of the line is the $3 \times 3$ identity matrix.

It is not necessary to know in advance whether A is nonsingular. If A is singular (not
nonsingular), we can still apply the Gauss-Jordan method, but somewhere in the process
one of the diagonal elements will become zero, and it will not be possible to transform A
to the identity matrix.
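
The enlarged-matrix method can be sketched in a few lines. This is an illustrative sketch only: the function name `invert` and the choice to return `None` for a singular matrix are assumptions made here, and exact fractions replace hand arithmetic. A singular matrix is detected at the step where no nonzero pivot can be found on or below the diagonal.

```python
from fractions import Fraction


def invert(matrix):
    """Return the inverse of a square matrix by Gauss-Jordan elimination
    on the enlarged matrix [A | I], or None when A is singular."""
    n = len(matrix)
    # Build the enlarged matrix [A | I] with exact rational entries.
    rows = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Look for a nonzero pivot on or below the diagonal.
        pivot_row = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot_row is None:
            return None  # A cannot be transformed to the identity matrix
        rows[col], rows[pivot_row] = rows[pivot_row], rows[col]
        # Scale so that the diagonal element becomes 1.
        pivot = rows[col][col]
        rows[col] = [x / pivot for x in rows[col]]
        # Clear the remaining entries of this column.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * p for a, p in zip(rows[r], rows[col])]
    # The block to the right of the vertical line is the inverse.
    return [row[n:] for row in rows]


# Coefficient matrix of Example 1 and its singular variant from Example 3.
A1 = [[2, -5, 4], [1, -2, 1], [1, -4, 6]]
A3 = [[2, -5, 4], [1, -2, 1], [1, -4, 5]]
assert invert(A1) is not None
assert invert(A3) is None
```

Multiplying the inverse of the first matrix by the column of right members (-3, 5, 10) recovers the solution (124, 75, 31) of Example 1.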


16.20 Exercises 


Apply the Gauss-Jordan elimination process to each of the following systems. If a solution 
exists, determine the general solution. 


1. x + y + 3z = 5
   2x - y + 4z = 11
   -y + z = 3.

2. 3x + 2y + z = 1
   5x + 3y + 3z = 2
   x + y - z = 1.

3. 3x + 2y + z = 1
   5x + 3y + 3z = 2
   7x + 4y + 5z = 3.

4. 3x + 2y + z = 1
   5x + 3y + 3z = 2
   7x + 4y + 5z = 3
   x + y - z = 0.

5. 3x - 2y + 5z + u = 1
   x + y - 3z + 2u = 2
   6x + y - 4z + 3u = 7.

6. x + y - 3z + u = 5
   2x - y + z - 2u = 2
   7x + y - 7z + 3u = 3.

7. x + y + 2z + 3u + 4v = 0
   2x + 2y + 7z + 11u + 14v = 0
   3x + 3y + 6z + 10u + 15v = 0.

8. x - 2y + z + 2u = -2
   2x + 3y - z - 5u = 9
   4x - y + z - u = 5
   5x - 3y + 2z + u = 8.


9. Prove that the system x + y + 2z = 2, 2x - y + 3z = 2, 5x - y + az = 6, has a unique
   solution if $a \neq 8$. Find all solutions when a = 8.

10. (a) Determine all solutions of the system

    5x + 2y - 6z + 2u = -1
    x - y + z - u = -2.

    (b) Determine all solutions of the system

    5x + 2y - 6z + 2u = -1
    x - y + z - u = -2
    x + y + z = 6.

11. This exercise tells how to determine all nonsingular $2 \times 2$ matrices. Prove that

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
= (ad - bc)I.
\]

Deduce that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is nonsingular if and only if $ad - bc \neq 0$, in which case its
inverse is

\[
\frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.
\]

Determine the inverse of each of the matrices in Exercises 12 through 16.


12. $\begin{bmatrix} 2 & 3 & 4 \\ 2 & 1 & 1 \\ -1 & 1 & 2 \end{bmatrix}$.


15. $\begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}$.


13. 


14 . 


1 2 
. 2-1 
1 3 

" 1 -2 

-2 5 

1 -4 



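16. $\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 \\
2 & 0 & 2 & 0 & 0 & 0 \\
0 & 3 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 2 & 0 \\
0 & 0 & 0 & 3 & 0 & 1 \\
0 & 0 & 0 & 0 & 2 & 0
\end{bmatrix}$.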


16.21 Miscellaneous exercises on matrices 


1. If a square matrix has a row of zeros or a column of zeros, prove that it is singular.

2. For each of the following statements about $n \times n$ matrices, give a proof or exhibit a
counterexample.

(a) If AB + BA = O, then $A^2B^3 = B^3A^2$.

(b) If A and B are nonsingular, then A + B is nonsingular.

(c) If A and B are nonsingular, then AB is nonsingular.

(d) If A, B, and A + B are nonsingular, then A - B is nonsingular.

(e) If = 0, then A — / is nonsingular. 

(f) If the product of k matrices $A_1 \cdots A_k$ is nonsingular, then each matrix $A_i$ is nonsingular.


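3. If $A = \begin{bmatrix} 1 & 2 \\ 5 & 4 \end{bmatrix}$, find a nonsingular matrix P such that $P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & -1 \end{bmatrix}$.

4. The matrix $A = \begin{bmatrix} a & i \\ i & b \end{bmatrix}$, where $i^2 = -1$, $a = \frac{1}{2}(1 + \sqrt{5})$, and $b = \frac{1}{2}(1 - \sqrt{5})$, has the
property that $A^2 = A$. Describe completely all $2 \times 2$ matrices A with complex entries such that
$A^2 = A$.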


5. If $A^2 = A$, prove that $(A + I)^k = I + (2^k - 1)A$.

6. The special theory of relativity makes use of a set of equations of the form $x' = a(x - vt)$,
$y' = y$, $z' = z$, $t' = a(t - vx/c^2)$. Here v represents the velocity of a moving object, c
the speed of light, and $a = c/\sqrt{c^2 - v^2}$, where $|v| < c$. The linear transformation which
maps the two-dimensional vector (x, t) onto (x', t') is called a Lorentz transformation. Its
matrix relative to the usual bases is denoted by L(v) and is given by


\[
L(v) = a \begin{bmatrix} 1 & -v \\ -vc^{-2} & 1 \end{bmatrix}.
\]

Note that L(v) is nonsingular and that L(0) = I. Prove that L(v)L(u) = L(w), where
$w = (u + v)c^2/(uv + c^2)$. In other words, the product of two Lorentz transformations is another
Lorentz transformation.

7. If we interchange the rows and columns of a rectangular matrix A, the new matrix so obtained
is called the transpose of A and is denoted by $A^t$. For example, if we have

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix},
\quad \text{then} \quad
A^t = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}.
\]

Prove that transposes have the following properties:

(a) $(A^t)^t = A$. (b) $(A + B)^t = A^t + B^t$. (c) $(cA)^t = cA^t$.

(d) $(AB)^t = B^t A^t$. (e) $(A^t)^{-1} = (A^{-1})^t$ if A is nonsingular.

8. A square matrix A is called an orthogonal matrix if $AA^t = I$. Verify that the $2 \times 2$ matrix
$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ is orthogonal for each real $\theta$. If A is any $n \times n$ orthogonal matrix, prove
that its rows, considered as vectors in $V_n$, form an orthonormal set.

9. For each of the following statements about $n \times n$ matrices, give a proof or else exhibit a
counterexample.

(a) If A and B are orthogonal, then A + B is orthogonal.

(b) If A and B are orthogonal, then AB is orthogonal.

(c) If A and AB are orthogonal, then B is orthogonal.

10. Hadamard matrices, named for Jacques Hadamard (1865-1963), are those $n \times n$ matrices
with the following properties:

I. Each entry is 1 or -1.

II. Each row, considered as a vector in $V_n$, has length $\sqrt{n}$.

III. The dot product of any two distinct rows is 0.

Hadamard matrices arise in certain problems in geometry and the theory of numbers, and
they have been applied recently to the construction of optimum code words in space
communication. In spite of their apparent simplicity, they present many unsolved problems. The
main unsolved problem at this time is to determine all n for which an $n \times n$ Hadamard matrix
exists. This exercise outlines a partial solution.

(a) Determine all $2 \times 2$ Hadamard matrices (there are exactly 8).

(b) This part of the exercise outlines a simple proof of the following theorem: If A is an
$n \times n$ Hadamard matrix, where n > 2, then n is a multiple of 4. The proof is based on two
very simple lemmas concerning vectors in n-space. Prove each of these lemmas and apply
them to the rows of a Hadamard matrix to prove the theorem.

LEMMA 1. If X, Y, Z are orthogonal vectors in $V_n$, then we have

\[
(X + Y) \cdot (X + Z) = \|X\|^2.
\]

LEMMA 2. Write $X = (x_1, \ldots, x_n)$, $Y = (y_1, \ldots, y_n)$, $Z = (z_1, \ldots, z_n)$. If each
component $x_i$, $y_i$, $z_i$ is either 1 or -1, then the product $(x_i + y_i)(x_i + z_i)$ is either 0 or 4.
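
The three defining properties of a Hadamard matrix are easy to test by brute force for very small n. The following sketch is offered as an illustration only; the names `is_hadamard` and `hadamard_matrices` are not from the text. It encodes properties II and III in a single comparison, using the fact that a row has length $\sqrt{n}$ exactly when its dot product with itself is n.

```python
from itertools import product


def is_hadamard(matrix):
    """Check properties I, II, III for a square matrix given as a list of rows."""
    n = len(matrix)
    # Property I: each entry is 1 or -1.
    if any(entry not in (1, -1) for row in matrix for entry in row):
        return False
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(matrix[i], matrix[j]))
            # Property II on the diagonal (squared length n),
            # property III off the diagonal (distinct rows orthogonal).
            if dot != (n if i == j else 0):
                return False
    return True


def hadamard_matrices(n):
    """Yield every n x n Hadamard matrix by exhaustive search (small n only)."""
    for entries in product((1, -1), repeat=n * n):
        matrix = [list(entries[r * n:(r + 1) * n]) for r in range(n)]
        if is_hadamard(matrix):
            yield matrix


# The search agrees with the count stated in part (a).
assert len(list(hadamard_matrices(2))) == 8
```

The search examines all $2^{n^2}$ sign patterns, so it is practical only for n up to about 4; it is a check on the definition, not a substitute for the proof outlined in part (b).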
*I 1.4 Exercises (page 8)


1. 

(a) lb 3 

(b) b 3 

( c ) 'igb 3 

(d) | b 3 + b 

2. 

(c) |a6 4 

+ be 




. (b) s n 

b w 

ab k+l 

3 

< k+l <S 

n C fc + 1 

+ b c 


(e) lab 3 + be 


I 2.5 Exercises (page 15)


1. A = {1, -1}, B = {1}, C = {1}, D = {2}, E = {1, -17},
F = {1, -17, $-8 + \sqrt{47}$, $-8 - \sqrt{47}$}.

2. $A \subseteq A$, $B \subseteq A$, $B \subseteq B$, $B \subseteq C$, $B \subseteq E$, $B \subseteq F$, $C \subseteq A$, $C \subseteq B$, $C \subseteq C$, $C \subseteq E$, $C \subseteq F$,
$D \subseteq D$, $E \subseteq E$, $E \subseteq F$, $F \subseteq F$. (Not counting “proper” inclusions.)

3. (a) True (b) True (c) False (d) True (e) False (f) False

4. (a) True (b) True (c) True (d) True (e) False (f) False

5. $\emptyset$, {1}, {2}, {3}, {4}, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4},
{2, 3, 4}, S

6. (a) False (b) False (c) False (d) True (e) False (f) False
(g) True (h) False (i) True

17. (c) A c c (d) Yes (e) No 


I 4.4 Exercises (page 35)

2 . 1 - 4+9 — 16 + -- + (-1 ) n+ V = ( — 1)" +1 (1 + 2 + 3 + 


+ «) 


1 1 

3 ■ 1 + 2 + 4 + 


1 1 

+ — = 2 

2 n 2 n 




5 . 


n + 1 

2 n 


6. (b) A(l) is false (c) I +2 + 1 ' ' + n < 


7 . =3 


C In + l) 2 
8 


I 4.7 Exercises (page 39)

1. (a) 10 (b) 15 (c) 170 (d) 288 (e) 3 6 (f) f 

8. (b) n + 1 



Answers to exercises 


9 . Constant = 2 
11 . (a) True (b) False 


12 . 


n 

n + 1 


(c) False 


(d) False 


(e) False (f) False 


I 4.9 Exercises (page 43)

2. (flj , b^j, (fl 2 , 6 5 ), ( a 3 , b 7 ), (a i , 6 1() ), (a 5 , b 3 ), (a 6 , b s ), (a 7 , 6 9 ), (o 8 , 64), (« 8 , ^u), (< 7 u j 64) 

3 . (a) False (b) True (c) True (d) False (e) False 


*I 4.10 Miscellaneous exercises involving induction (page 44)

(d) 21 (e) 680 


1. (a) 

10 (b) 1 

(c) 

7 

2. (b) 

17 (c) 9 

(d) 

No 

0 

«+ 1 


n 

5 . TT fl * 

ic=l 

= 1 ; J\a k = 

k= 1 

a n+ 1 

n 


8 . 2 n 

9 . True if each a k > 0 
11. n> 4 


(0 1 


Chapter 1 


1.5 Exercises (page 56)

1. f(2) = 3, f(-2) = -1, -f(2) = -3, f(1/2) = 3/2, 1/f(2) = 1/3, f(a + b) = a + b + 1,
f(a) + f(b) = a + b + 2, f(a)f(b) = ab + a + b + 1

2. f(2) + g(2) = 2, f(2) - g(2) = 4, f(2)g(2) = -3, f(2)/g(2) = -3, f[g(2)] = 0,
g[f(2)] = -2, f(a) + g(-a) = 2 + 2a, $f(t)g(-t) = (1 + t)^2$

3. $\varphi(0) = 4$, $\varphi(1) = 2$, $\varphi(2) = 2$, $\varphi(3) = 2$, $\varphi(-1) = 6$, $\varphi(-2) = 8$, t = 1.

4. (a) All x (b) All x and y (c) All x and h (d) All y (e) All t
(f) All a


H - \ 


5. (a) x\ <2 (b) \y\ < 1 (c) 

(f) x\ <2, x *0 

6. (b) {x | 0 < x < 1} (c) {x 2 <x< 4} 

7 . Intersect when x = 0 , 1 , — 1 

8 . Intersect when x = - 1 , -3 

10 . (a) p(x) = 1 (b) p(x) = |x(x — 1) + 1 

(d) p(x) = ax(x — 1) + b, a and b arbitrary 

11 . (a) p(x) = ax( 1 — x) + b, a and b arbitrary 

(c) p(x) = ux, a arbitrary (d) p(x) = c, c arbitrary 


(d) 0 < a <4 (e) 

(d) Domain is empty 


< 4 


(c) p(x) = ax(x — 1) + 1, a arbitrary 
(b) p(x) = c, c arbitrary 


12. (a) 


i{iy 

k=Qr ' ' k=o 


X'" 


(C) 


1.11 Exercises (page 63) 

n— 3 


5. [nx] 


= 2 x+ 


1.15 Exercises (page 70) 

1. (a) 2 (b) 4 (c) 6 (d) 4 (e) 6 (f) -6 

2 . Oneexample: y(x) = f if 0 < x < 2 , s(;t) = - 1 if 2 < x < 5 
s ' 1 b| 2 ILi k (Vk + 1 Vk) = 2(21 - 3\/2 - V3 - a/5 - V6 - VD

6 . (() X = 1, X = ® 

2, (a) 13 

10 . (a) f ( 3) = l,/(4) = -1./IA3)] = 0 (b) p = 14, p -- 15 

11. (a), (d), (e) 

12. (a), (b), (c) 

1.26 Exercises (page 83) 


1. 

9 


e. 2 

1 1 - 2 - 1 

11. g 

IS. 

62 

27 

2, 

1 1 


7. 0 

12. 18 

12, 

■ 78 

3, 

1 s 


i. 0 

13. a 

18. 

2592 

35 ” 

4, 

0 


9 . s 

14 _ 1 

1,1 3 

19. 

5 8 /21 

S, 

1 


10. 11 

15. 2 

20. 

— 2 n /l 1 

21. 

(a) 

0, § 

(b) 0 




22. 

la) 

| (b ) 

c/2 




23. 

P( x) 

= 6 x — 

6x 2 




24. 

p(x) 

= 4x + 

8x 2 + 3x 3 




22, 


(b - a)f(B) if A = 0. 
2.4 Exercises (page 94)


1 32 

1 • 3 


9, 

K 5\/5 - 3) 

- 32 

4. 3 


10. 

7 

4 

3. 3 


11. 

7 

3 

4- t 


12. 

7 

3 

5. -A> 


13 

9V3 - 1 

12 



27 

6 4\/2 3\ 3 /2 1 


14. 

5 

3 2 + 12 




4\/2 3\ 3/ 2 1 


1 5 . 

c = A 

'■ 3 2 + 6 




8. JO0-4V2) 


16. 

a = -2 

17. (a) 9irj2 (b) 77-/2 

1 C ) — 6j7 




2.8 Exercises (page 104) 

Note: In Exercises 1 through 13, n denotes an arbitrary integer. 

1 . ( b ) \tt tlTT 

2. (a) + 2 mr (b) 2 tin ( c ) + 2«7t (d) (2 n + \)tt 

tanx + tan y cotxcoty — 1 

6 - tan (x + y) = 1 - tanxtany ; COt (x + y) = ^FTT^T 

7 . A = f, B = 2V3 

8. A = C cos a, B = C sin a 

9 . C = ( A 2 + B 2 ) 112 . If A 2 + B 2 0, choose a so that COS a = AjC , sin a = BjC. 
If A = B = 0, choose any a. 


10. 

c = 

2\/2, 

TT 

II 



11. 

C : 

= V 2, 

a = -a/4 



12. 

I* 

+ ftTT 




13. 

b 

+ 2/Irr; 

7T + 2«77 



17. 

(a) 

1 - 

1V3 (b) 1 

" 2 V 2 (c) S 

(d) 1 


00 

KV3 

- A/2) 


18. 

2^ 2 

+ 2 


21. 

2 a/2 - 2 

19. 

1 + 7r 3 /24 


22. 


20. 

0 



23. 

\/3 + tt/6 


(e) 2 (f) 0 (g) 0 


24. 

25. 

26. 1 

27. 1 


V3 + ix + sin x + 77/6 if 0 < x < 2 tt/3 ; 
(x 6 — x 3 )/3 + COS x — cos (x 2 ) 


■ sin x + 5 tt/ 6 if 2 tt/ 3 < x < 


2.11 Exercises (page 110) 


5 . 

4?r 3 /3 


9 . 

877 



13. 

2 


6 . 

7T 


10 . 

77/8 



14. 

377/2 


7 . 

27T 


11 . 

77/2 



15. 

977/2 


8 . 

477 


12 . 

2 






2.13 

- Exercises (page 114) 








1 . 

77C 2 /> 3 /3 

5. 

77 2 /2 


9 . 

377/10 


13 . 

(¥ - 4 a / 3 ) t 7/- 3 

2 . 

7r/2 

6 . 

77 2 / 4 


10 . 

a /2 


14 . 

a = 1 

3 . 

2tt/3 

7 . 

77 2 


11 . 

2W3 


15 . 

16A/3/3 

4 . 

3377-/5 

8 . 

77/2 


12 . 

6 

5 


16 . 

4« 5 /5 

17 . 

(/?! + 4M 

+ $ 2 ) 








18 . 

OO 

(b) 277 

(c) IOtt/3 

(d) 1677/15 





2.15 Exercises (page 116) 


1. 60 ft-lb 

5. 

3750 

2. 125 joules; 0.8 meter 

6. 

5000 

3. (a) 441 joules (b) 425 joules 

7. 

20,000 

4. a = 3, b = — 2 

8. 

21,800 

2.17 Exercises (page 119) 



1. $(a^2 + ab + b^2)/3$

6. 

2/77 

2. Ts 

7. 

2/77 

3. 1 

8. 

I/77 

4. 45 

28 

9. 

1 

2 

5. 2/77 

11. c = a/\/ 3; c = a/(« + l) 1 /” 

10. 

1 

2 

12. (a) w(x) = x (b) w(x) = x 2 

(c) w(x) 

= X 3 


14. All three 

16. (a) 1/2 (b) C73 ( C ) L/V3 

17. (a) 71/12 (b) 5L 3 /8 (0 Vl5 L/6 

18. (a) 2L/3 (b) I 4 /4 (c) V2 1/2 _ 

19. (a) 11L/18 (b) 31L 4 /192 (c) a/62 L/12 
20. (a) 3L/4 (b) $L^5/5$ (c) $\sqrt{15}\,L/5$

21. (a) 211/32 (b) 19L 5 /240 (c) Vm L/2Q 

22. p(x) = x 2 for 0 <. x < L gives x = 3L/4 

23. (a) 6/tt (b) 3_V2/2 

24. T =2tt sec; 80%/ 3 

2.19 Exercises (page 124) 

1. x + x 2 /2 + x 3 /3 

2. 2 y + 2y 2 + 8y 3 /3 

3. f + 2x + 2x 2 + 8x 3 /3 

4. -2x + 2x2 — x 3 

5. (3a: 5 + 5x 3 + 136J/15 

6. x 10 /5 + 2x e /3 - x 5 / 5 - 2x 3 j3 + x 2 - x 

I. x + I x 3/2 - I 

8. §(x 3 - x 3 / 2 ) + t(x 5 '' 2 - x 5 / 4 ) 

9. sin x 

10. \x 2 + sin (jc 2 ) 

II. ix 2 - Jx + COS (x 2 ) - COS x 

12. i(x 3 -COS3X+l) 

13. |(x 6 - x 3 + COS 3x - COS (3x2)) 

14. | y 2 + i y — | sin 2% 

JC 

15. 2 sin - - | cos 2x + \ 

16- f(x + w) + sin x + \ sin 2x 

17. 0, ±V2 

18. (c) P(x) = i(x [x]) 2 - \(x - [x]) (d) A 

20. (b) g(2) = 2A,g(5) = 5A (c) A = 0 


Chapter 3 


3.6 Exercises (page 138) 


1 . 

2 . 

3. 

4. 
22 . 

23. 

24. 

25. 
28. 

29. 

30. 
32. 


A 5. 2t 

- 1 6.-1 
4 7. 1 

1 8 . 0 


9. 0 13. 1 

10. 0 14.-1 

11 . 1 

12 . -1 


a = (sin c — b)jc if c ^ 0; if c = 0 there is no solution unless b = 0, in which case any a 
will do. 

a = (2 cos c —b)lc 2 if c 0; if c = 0 there is no solution unless b = 2, in which case any a 
will do. 

The tangent is continuous everywhere except at x = + nn , where n is any integer; the 

cotangent is continuous everywhere except at x = tin, where n is any integer. 
f(x) ->■ 1 as x -+ 0. Define/(0) = 1 for continuity at 0. 

No 


No 

f(x) — ► 0 a s x -+ 0. Define /(0) =0 for continuity at 0. 
f(x) ->0 as x -+ 0. Define/(0) = 0 for continuity at 0. 
3.8 Exercises (page 142)

1 . x 2 1 , allx 

2. (x - l) 2 , all x 

3. jjc|, all x 

4. 0, defined only at x = 0 

5. x,x > 0 

11 , -3 13. 1 15. 1 

12. VJ 14. 1 16. 2 

21. *2 jf* > 0; Oifx <0 

22. 1 if 1 < |.r| < V3; 0 otherwise 

23. *2 if* >0; Oifx <0 


6. - x , x > 0 

7. sin Vx,x >; 0 

B. V sin x, Ik-n < x < (2k + 1 ) 77 , k an integer 
9. J x + V x, x > o 

10. Jx + Vx + Jx + Vx, x > 0 

17. 0 19. 1 

18. 2 20. 1 


3.15 Exercises (page 149) 

1- g(y) =9 1; ally 

2 - g(y) = Uy - 5 ); all y 

3. g(y) = -i; ally 

4. g(y) = y 1 I 3 ; all y 

5. g(y) = jify< 1; Vy if l < y < 16', (yl 8) 2 if y > 16 


3.20 Exercises (page 155) 

3. 0.099 669 rounded off in the sixth decimal place 


Chapter 4 


4.6 Exercises (page 167) 

1. /'(0) = 1 ,/'(*)= 0,/'(l)= — 1, /'( — 10) = -19 

2. (a) 1, -2 (b) 0, -1 (c) 3, -4 

3. 2x + 3 

4. 4x 3 + COS x 

5. 4* 3 sin x + * 4 cos x 

6. -l/(* + l) 2 

7. —2 */(* 2 + l) 2 + 5* 4 cosx * 5 sin x 

8. — 1/(jc - l) 2 

9. sin x/(2 + cos x) 2 

in 2x 5 + 9x 4 + 8* 3 + 3x 2 + 2x — 3 

10 ' (x 4 + X 2 + l ) 2 

1 - 2(sin x + cos x) 

1L (2 - COS x) 2 

sin x + x cos x 2x 2 sin x 
11 iTx 2 (1 + x 2 ) 2 

13. (b) v 0 /32 sec (c) -v 0 ft/sec (d) 16 ft/sec; 160 ft/sec; 167" ft/sec 
(f) f(t) = v 0 t — 10( 2 is one example 

14. 3x 2 , where x is the length of an edge 
16. J*- 1 / 2 
17.


22 . 


23. 


- 1 

28. 

2\/x(l + Vx) 2 

h 112 

-2-x 3 — 5/2 

29. 

i-x- 1 ' 2 + lx- 2 /® + ix- 3 / 4 

-lx" 3 ' 2 “ ix-*i 3 - -IX - 3 / 4 

30. 

1 -x 

31. 

2\/x(l + x) 2 


2 + Vx 

32. 

2(1 + Vx) 2 

sec x( 1 + 2 tan 2 x) 

33 

x sec 2 x + tan x 

34. 


2(1+ x 2 ) 

(1 - x 2 ) 2 

2(1 - 2x) 

(1 — x + X 2 f 

x cos x — sin x 

V 

1 + COS X 


(x + sin x ) 2 

ad — be 
(cx + df 


( 2x 2 + 3) sin x + 4x cos x 


(2x2 + 3) 2 


35 


(2 ax + b)( sin x + cos x) + ( ax 2 *f bx + c)(sin x — cos x) 


1 + sin 2x ^ "1 go 

36. a = d = 1 ; b = c = 0 ^ 

37. a = c =0; b =f = 2; d = - 1 

«x n+1 - (n + l)x" + 1 


38. (a) 

(b) 


(x - l) 2 

n 2 x n+3 — (2 n 2 + 2n — l)x n+2 + ( n + l) 2 x n+1 — x 2 — x 
(x - l) 3 


4.9 Exercises (page 173) 

1. 1, 3 

2. (a) -1,| (b) -b 0 (c) -2, f 

3. (2n + l)7r, where n is any integer 

4 . a = —2, 6=4 

5. a = l,b = 0, c = -1 

6. (a) x x + x 2 + a (b) K*i + x 2 ) 

7. Tangent at (3, -3); also intersect at (0,0) 


3_1 + 4Vx + 5x 
4 V x(x + V x ) 4 

-2 


15. (a) True (b) True (c) False if/'(°) 96 0. Limit is 2f'(a) 
(d) False if f'(a) -A 0. Limit is |/'(a) 


8. m = - 2 , b = -2, a = b c = I 

9. a = 2c, b = -c 2 

3 1 

10. a = — , b = - — -= 

2c 2c 3 

11. a = cos c, 6 =sinc — ccosc 

1 + 3 


12 . - 


1 


Vx(l + Vx) 2 ’ 2(x + Vx) 3 

13. a = -4, b = 5, c = -1, d = 


14. (a) 


15 


(b) 2 (c) i 
16. (a) $D^*(f + g) = (1 + g/f)D^*f + (1 + f/g)D^*g$ when f(x) and g(x) are not 0;

$D^*(fg) = g^2 D^*f + f^2 D^*g$;

$D^*(f/g) = (g^2 D^*f - f^2 D^*g)/g^4$ when $g(x) \neq 0$

(b) $D^*f(x) = 2f(x)\,Df(x)$

(c) f(x) = c for all x


4.12 Exercises (page 179) 


1. -2 cos x(l + 2 sin x) 

2 - x/V 1 + x 2 

3. ( 2x 3 — 4x) sin x 2 2x cos x 2 + 2 sin x 3 + 6x 3 cos x 3 

4. -sin 2x cos (cos 2x) 

5. n sin” -1 x cos (n + l)x 

6. cos x COS (sin x) cos [sin (sin jc|] 

^ 2 sin x(cos x sin x 2 — x sin x cos x 2 ) 

sin 2 x 2 

8. 2/(sin 2 x) 

16 cos 2x 
sin 3 2x 


10 . 

11 . 

12 . 

13. 

14. 

15. 

16. 


1 +2x2 

4(4 _ x 2 )- 3 ' 2 
2x2 / 1 + x s \l /3 
1 - jc 6 U - x 3 ) 
-(1 + X 2 )- 3/' 2 


— — 6 , where g(x) = 

&V xg(x)V x + g(x) 

6 + 3x + 8x 2 + 4x 3 + 2x 4 + 3x5 
(2 + * 2 ) 1/2 (3 + x 3 )* 3 
fix) = ( X + l)- 2 ; g’(x) = (2x + l )- 2 


17. x 

h(x) 

h'{x) 

k(x) 

k'(x) 

0 

0 

-10 

0 

5 

1 

1 

5 

1 

12 

2 

2 

4 

2 

-10 

3 

3 

12 

3 

4 

18. x 

<?'(*) 

fix) 



0 

0 

0 



1 

3 

10 



2 

30 

36 
19. (a) $2x f'(x^2)$

(b) $[f'(\sin^2 x) - f'(\cos^2 x)] \sin 2x$

(c) $f'[f(x)]\,f'(x)$

(d) $f'(x)\,f'[f(x)]\,f'\{f[f(x)]\}$

20. (a) 75 cm 3 /sec (b) 300 cm 3 /sec (c) 3x 2 cm 3 /sec 

21. 400 mph 

22. (a) 20V5 ft/sec (b) 50^2 ft/sec 

23. 7.2 mi/hr 

24. (a) and (b) 5/(4 tt) ft/min 

25 . c = 1 + 3677 

26. dVjdh = 7577 ft 3 /ft; drjdt = 1/(1 577) ft/sec 
66 

27. — cm 2 /sec 

28. n = 33 

29. (a) x = 1, y = | (b) Jv 7 3 


4.15 Exercises (page 186) 


3. 

6 . 


7. 


(b) C 

(a) 6 

(b) 6 

(b)f 


= i c = V2 
_ 1 a 1 
— 2 ’ ^ 2 

X + i/i 

= 3 ; 0 - 

X + V* 2 + xb + 3/1 2 

has at most k + r zeros in [a, b] 


if x > 0 


4.19 Exercises (page 191) 

1. (a) | (b) fdecreases if x <|- ; increases if x > | (c) f' increases for all x 

2 • ( a ) ±f\/3(b) f increases if \x > fV 3 ; decreases if W<fV3 

(c) /' increases if x > 0; decreases if x < 0 

3. (a) ±1 (b) f increases if |%| > 1; decreases if |x[ < 1 

(c) f 1 increases if x > 0; decreases if x < 0 

4. (a) 1,3 (b) f increases if x < 1 or if x > 3 ; decreases if 1 < x < 3 

(c) /' increases if x > 2; decreases if x < 2 

5. (a) 1 1 (b) /ticreases if x > 1; decreases if x < 1 (c ) f 1 increases tor all x 

6 . ( a ) none (b) f increases if x < 0; decreases if x > 0 

(c) /' increases if x < 0, or if x > 0 

7. (a) 2 1/3 (b) f increases if x < 0, or if x > 2 1 / 3 ; decreases if 0 < x < 2 1/3 

(c) /' increases if x < 0, or if x > 0 

8. (a) 2 ( b ) /increases if x < 1, or if 1 < x < 2; decreases if 2 < x < 3, or if x > 3 

(c) f 1 increases if x < 1, or if x > 3 ; decreases if 1 < x < 3 

9. (a) ±1 (b) f increases if |jc|< 1; decreases if \x >1 

(c) (’’increases if — \/3 < x < 0, or if x > a/ 3 ; decreases if x < — \' 3, or if 0 < x < 

10. (a) 0 (b) f increases if x < -3 or if -3 < x < 0; decreases if 0 < x < 3, or if x > 3 

(c) /' increases if |jc| > 3 ; decreases if |jc| < 3 

Note. In Exercises 11, 12, and 13, n denotes an arbitrary integer. 

11. (a) J«77 (b) f increases if n7r < x < (n + |)tt; decreases if (n — 5)77 < x < nrr 

(c) /' increases if (« — < x < (n + JV; decreases if (n + 5)77 < x < (n + fV 

12. (a) 2fl77 (b) f increases tor all x 

(c) /' increases if 2nn < x < (2 n + 1)tt; decreases if {In — 1)77 < x < 2m 
13. (a) (2n + 5)7 t (b) f increases for all x

(c) /' increases if (2n + |V r < x < (2 n + decreases if (2 n - |V < x < (In + IV 

14. (a) 0 (b) fincreases if x > 0; decreases if x < 0 (c) f increases for all x 

4.21 Exercises (page 194) 

2. \L ft wide, \L ft long 

3. Width \\'2A, length sj 2A 
7 . V2 L 

10 . r = i/i = R/V 2 

12. r = \R, h = \H 

13. r = 2/?/3, /i = HI3_ 

14. h = $R, r = iVlR 

15. A rectangle whose base is twice the altitude 

16. Isosceles trapezoid, lower base the diameter, upper base equal to the radius 

17. (a) 6|, 6§, f_ 

(b) _ 8 + 2y/l, 2 + 2\/7, 5 - V7 

18. V5 

19. (a) 20V 7 3 mi/hr; $10.39 

(b) 40V2 mi/hr; 816.97 

(c) 60 mi/hr; $22.00 

(d) 60 mi/hr; 827.00 

(e) 60 mi/hr; $32.00 

20. V4 

21. Crease = |\/3 inches; angle := arctan i\/ / 2 

22. (a) max = 3\/ 3 17 min = 4r 

(h) ]/• 

23. Rectangle has base 4PJ(2tt + 8 ), altitude P(A + tt)I(6tt + 16) 

2 4 . v = 48:7 for 0 < h < 2; V = 4V4 + hf/( 9h) for h > 2 
26 A = 2(\-) 7/2 

27. m(t) = 8 if f- > | ; m(t) = / 2 «. J if f 2 < J 


*4.23 Exercises (page 201) 

1 . = 4x 3 - 8 x/; = 4/ - 8 x 2 y; ^{= >2x 2 - 8 /; |^= 12/ - 8 x 2 ; 

_jy _ 

3x3y 3ySx“ 6jr> 

2 . f x = sin(x + y) + x cos (x +y); f v = x cos (x + y); f m = -x sin (x + y)\ 
f xx = 2 COS (x + y) - x sin (x + y); f xy =f yx - COS (x + V) - X sin (x + j) 

3 - Dif y + K-V = -x - *y/: = °; D 2.2.f - 2 ;xr 3 ; 1 - 

4 . f x = x§t"- y ■yK.x- ^ v 2 ) 1,2 ; /« =y 2 (x 2 + y 2 )-3/2. 

/,„ = x V + /r 3/2 ; /*, = U = + /r 3/2 

5. = 6 x 7 cos (x 2 /) - 9x 4 / sin (x 2 /); 

f„ = f„ ~ 6 x/ cos (x 2 /) - 6 x 3 / sin (x 2 /) 

6 . f n = f iJX = 6 cos ( 2 x - 3 y) co S [cos ( 2 x - 3j)] + 6 sin 2 ( 2 x - 3 y) sin [cos ( 2 x - 3y)] 

d 2 f d 2 f 9 2 / d i f 

1 J = 4x,x - ,)-3 

8 , = — 3xy 2 (x 2 + /)-5/2; f yy ;= -X(x 2 - 2/)(x 2 + /)~ 5/2 ; 

/« =fyx - y( 2x 2 - J 2 )(x 2 + /)-5/2 


7 


-2 
Chapter 5


5.5 

1. 

2 . 

3. 

4. 

5. 

6 . 

7. 

8 . 

9. 

10 . 

14. 

15. 

16. 

17. 

18. 

19. 

20 . 

21 . 


Exercises (page 208) 
f(6 4 - a 4 ) 

4 (6 5 - a 5 ) + 6 (a 2 - ft 2 ) 
~(6 5 - a 5 ) + {(/>' - a 1 ) 


(6 2 - a 2 ) - 2(6 - aj 


/l 1 

- a 2 ) - - - - 
\6 a 


1 

+ xl 7T 5 


a) + f(6 3 / 2 


a 3 ' 2 ) +W 


(b_ 

V 2(6 3/2 - a 3/2 ) 

f(65/2 _ fl 5/2) _ 2(6 3 /2 „ flS/8) I 
1(64/3 _ a m „ b m + a n 3) 

K6 6 — a 6 ) — 3(cos b — cos a) 
|(6 7 /3 — a 7/3 ) — 5(sin b — sin a) 

/(h) = h; /'(h) =2-7/ 
f(t) = —sin /; c = 77 /3 
f(t) = sin t ■» 1 ; c = 0 
f(x) = 2xi5; c = _1 
p(x) = 3 + h + 

/"( 1) = 2; / '» (1) 

(a) (1 + x 2 )- 3 

2x 13 3x 20 

1 + x 8 1 + x 12 


-« 2 ) 

7(6i/ 2 


1 / 2 ) 


l v 2 

l X 


(b) 2x( 1 + x 4 )- 3 (c) 2x( 1 + x 4 r 


3x 2 (l + xV 


22. (a) 16 (b) 1 + 3 S V2 (c) (36)i/ 3 ( d ) i 

23. /(a) = a(3 - COS a// 2 

24. (a) —77 (b) 1 - 77 (c) 0 ( d ) -77 2 (e) 3 tt/2 

25. (a) 77-1 (b) i (c) i+ h -!)(/- 1) (d) ,J(/ 1)+ (77 J)d D 2 /2 

26. (a) None (b) One example is/(x) = x + x 2 (c) None 

(d) One example is f(x) = 1 + x + x 2 for x > 0 ,/(x) = 1/(1 — x) for x <0 

28. (a) implies a and 6; (b) implies a; (c) implies a and y, (d) implies a and 6; 

(e) implies a, (5, and e, 


5.8 Exercises (page 216) 

1 . J( 2 x + lh 2 + c 

2. (:/V)0 + 3x)5/2 _ (J-)(l + 3x) 3 /2 + o 

3. f(x + 1) 7/2 - f(x+ 1)5/2 + |(X + 1) 3/2 + o 


5 ' “ 4(x 2 + 2x + 2) 2 + C 

12. 2(cos 2 - COS 3) 

COS x” 

6 . £ cos 3 x — cos x + C 

13. - - + c 

n 

7. 3(z-l) 7/3 + f(z - l) 4 ' 3 + C 

14. -jVl - x 6 + c 

8. -fcsc 2 * + c 

15. |(1 + O 9 / 4 - -f(l + /) 5/4 + 

9. | - V 3 

16. x(x 2 + 1)~ 1/2 + C 

1 

17. -i 0 -(8x 3 + 27)5/ 3 + C 

10 . + c 

3 + cos x 

18. f(sin x - cos x) 2/3 + C 

2 

19. 2 V l + Vl + x 2 + c 






5.10 Exercises (page 220) 

1. $\sin x - x \cos x + C$

2. $2x \sin x + 2 \cos x - x^2 \cos x + C$

3. $x^3 \sin x + 3x^2 \cos x - 6x \sin x - 6 \cos x + C$

4. $-x^3 \cos x + 3x^2 \sin x + 6x \cos x - 6 \sin x + C$

5. i sin 2 x + C 

6. ^ sin 2x — lx cos 2x + C 
15. (b) (5jr/32)a 6 

17. §(3a/ 31 + a/3 - 11.35) 

18. tanx — x; Jtan 3 x — tanx +x 

19. — cotx — x; ~J-cot 3 x +cotx +x 

20. (a) n = 4 (b) 2 


* 5.11 Miscellaneous review exercises (page 222) 


1 ^ 1 / 0 ( 0 ) = 0 if 0 < k <, n — 1 ; £- ,n, ( 0 )= n! 


2 . 

6 . 

7. 

9. 

10 . 
11 . 
12 . 

13. 

14. 

15. 

16. 

17. 

18. 

19. 

20 . 
22 . 

23. 

24. 
26. 


6x 5 — 15x 4 + 10x 3 + 1 

3 

61 

y = 16x 2 /9 

(b) /'( 0) = 0 

— | cos 5x + '25 sin 5x — fx cos 5x 
i(l + x 2 ) 3 '* 

— 3 10 /20 
37/8281 
aV + x*? 

1/265650 

COS COS 1 

[12(x - 1) 1/2 - 24] sin (x - l) 4 ' 4 - 4[(x - 1) 3/4 - 6(x - l) 4 ' 4 ] cos (x - l) 4 ' 4 
l sin 2 x 2 

-§(1 + 3 cos 2 *) 3/2 


a = 9. b = t/ 2 

8 16 128 256 

15) ^.59 gig 1>S3 


_1„ Y 13 
1 3* 


i v i2 + 




27. 1 (1 
2 \2 


+ 2 


4 ) 


34. (a) p(x) = —x 2 + x — 1 

35. (a) P^x) = x - P 2 (x) = x 2 - x + i; P 3 (x) = x 3 - |x 2 + Jx; 

P 4 (x) = x 4 - 2x3 + x 2 - if,,-; P s Hx) = x5 _ |x 4 + fx 3 - fx 


Chapter 6 


6.9 Exercises (page 236) 

1. (a) 1 (b) ( a + b)}(\ + ab ) 

e - 1 (e 2 — l) 2 

2. (a) 0 (b) — (c) 4 (d) 4 ^~ 

3. Increasing if 0 < x < e, decreasing if x > e; convex if x > e 3l2 ; concave if 0 < x < e 3/2 
4. $(2x)/(1 + x^2)$

5. xj(\ + x 2 ) 

6. x/(x 2 — 4) 

7. l/(x log x) 

8. (2/x) +l/(x: logx) 

9. x/(x 4 -l) 

15. — 1 /(jc log 2 jc) 

16. J log |2 + 3 jc| + c 

17. x log 2 |x — 2x log |x + 2x + c 

18. |x 2 log x — Jx 2 + c 

19. §x 2 log 2 x gx 2 log |x| + jx 2 + c 

20. 3 

21. log |sin x| + C 

y'/l+l 

22. — — log \ax\ - 2 SC i f « ^ 

n + 1 (« + l) 2 

23. y (log 2 |*l - S log H + f) + C 

24. log |log xj + c 

25. - 2 

26. Q(-2 + log |x|)\/l + log |x| + c 

27. y log 3 |x| - -re a: 4 log 2 |xj + - 3 %x 4 log [xj - 

34. 41ogx 

35. 3 + 3 log x. 

36. a log a 


n(x + dl + x 2 )" 

10 . - 7 == — - 

VI +x 2 

11 . 1 /[ 2(1 + Vx + 1 )] 

12. log (x + Vx 2 + 1) 
1 3 . Ilia - bx 2 ) 

14. 2 sin (log x) 


-1 ; | log 2 |«x| + C if n= -1 


_ 3 _ 
128 


X 4 + C 


6.17 Exercises (page 248) 


2.1 

8xe 4a;2 

6. 

7. 

2 3: log 2 
2 4 ^ x log 2 

3. 

-2x<r* 2 

8. 

(cos x)e 8in * 

4 . 

e Vx 

9. 

-(sin 2x)e cosZ * 

2Vx 

10. 

1 


gllX 

11. 

gx e e x 

5. 

-~x* 

12. 

X P^ 

e x e e e e 

13. 

e x (x - 1) + C 



14. 

-e~ x (x + 1) + C 

18. 

-|(x 2 + l)e~ x * + C 

15. 

e x (x 2 - 2x + 2) + C 

19. 

b = e a , a arbitrary 

16. 

-\e~ 2x {x 2 + x + J) + C 

21. 

X*(l + log x) 

17. 

2(Vx - \)e^ x + C 

22. 

1 + (1 + 2x + 2x 2 )e : 


23. 

4*. 

+ 

1 

tc 

24, 

oV”- 1 + ax' l ^a x “ log a + 

a x a“*(log a) 2 


25. 

l/[x log x log (log x)] 



26. 

e x {\ + e 2 *)-i/2 



27 

X x x x> i + log x + (log x) 2 



28. 

(log x)^ log log x + logx j 
29. $2x^{-1 + \log x} \log x$
(log x) 1 ' 1 

30 "jcl+ T^ i ' a; & ~ 2 ( l0 g X)i + x lo S x lo g dog *)] 

31. (sin *)i+co8* [ cot 2 x _ i 0 g ( S i n x )] _ ( cos ^l+sina; [ tan 2 x _ j 0 g ( cos x)] 

32. *"*W(1 - log x) 

54x — 36x 2 + 4x 3 + 2x 4 

31 3(1 - x) 2 (3 - x) 23 (3 + xfl 3 

n n 

34. TT (x-a^J 

i= 1 k = 


x-a k 


6.19 Exercises (page 251) 

16. | 

17. | 

18. sinh x = cosh x = f| 

19. 3i 12 

12 

20' 4f 

6.22 Exercises (page 256) 


12 . 

13. 


1 


V4 - x 2 

1 


if \x\ < 2 


Vl + 2x ~ x 2 
1 


if Jjc - 1| < V2 


14. 7- 1 — — if Ixl > 1 

\*\Vx 2 -l ' ' 

COSx. 


15. 

16. 

17. 

18. 


COS X 

V X 


2(1 + x) 

1 + X 4 


1 + x 6 


2x 


|x| (1 + X 2 ) 


if x # 0 


2 9 . arcsin n + C 

\a\ 

q n • * + 1 , ^ 

3 0 . arcsin + c 

V2 


19. 


20 . 


2x 


if x ^ (Jc + 


1 


2(1 + x 2 ) 


cos x + sin x 

21. , .. — if kn < x < (k + J)77 


if x ^ (k + k an integer 
if x > 0 


22 . 


V sin 2x 
x 


|x| Vl - X 2 


if 0 < |x| < 1 


23. 1/(1 + x 2 ) if x ^ 1 

4x 

24. if |x| < 1 

Vl ™ x 4 (arccos x 2 ) 3 


25. 

27. 


2x \ 'x - 1 arccos (l/Vx) 

3x + ( 1 + 2x 2 ) arcsin x 
(1 - x 2 ) 2 (1 - x 2 f 2 


if x>l 


31. - arctan - + C 

a a 

1 la 


3 2. - - arctan / - x ) + C if a b > 0 ; 


a\l b 


2 |a|V -ab 


log 


VJa\ + xV\b\ I 


V|a| - xVlf'l I 


C if $ab < 0$

33. $\dfrac{2}{\sqrt{3}} \arctan\dfrac{2x - 1}{\sqrt{3}} + C$

34. $\tfrac12[(1 + x^2) \arctan x - x] + C$

x 3 2 + x 2 , 


35. — arccosx — 


Vl -X 2 + c 


36. $\tfrac12(1 + x^2)(\arctan x)^2 - x \arctan x + \tfrac12 \log(1 + x^2) + C$

37. (1 + x) arctan v'x - Vx + C 7„ rrt „ n x 

38. (arctan Vx) 2 + C ^ 2\ arctan x 

39. i(arcsin x + xsf 1 - x 2 ) + C 43. arctan 

(X _ lj^retan * 


43. arctan e x + c 


2Vl + x 2 

(x + l)e arctan x 

41. .- — + c 

2Vl + x 2 


47. i \b — a\ (b — a) arcsin J — 

6.25 Exercises (page 267) 

1. log |x — 2| + log |x + 5| + 

, , {x+2f I „ 


arccot e x 

44. I log (1 + e~ 2x ) - S C 


45. a arcsin - — \/ a 2 — x 2 + C 

a 

2 (b — a) /x — a 

46. -T 7 r arcsin 1- C 

\b - a\ V b - a 

lV (x - a)(b - x) [2x - (a + b)] + c 


2. a log 


(x + l)(x + 3) 3 + 


3 ' 3(x - 1)+ ° l0!; x +2 


4. ^x 2 — x + log 


x 3 (x + 2) 


5 - lo sl* + 1 l -(^TIT 2+ 2TTT + c 

6. $2 \log|x - 1| + \log(x^2 + x + 1) + C$

7. $x + \tfrac13 \arctan x - \tfrac83 \arctan(x/2) + C$

8. $2 \log|x| - \log|x + 1| + C$

9. $\log|x| - \tfrac12 \log(x^2 + 1) + C$

9x2 + 50x + 68 (x + l)(x + 2) 16 

10 - 4(x + 2)(x + 3) 2 + 8 log (x + 3) 17 

1 i 

11. 7 + log x + 1 + c 

X + 1 61 1 

12. Jlog |x 2 — 1 — log |x| + c 

13. x + flog [x - 2| - flog |x + 3| + c 

14. $\log|x - 2| - \dfrac{4}{x - 2} + C$

15. $\dfrac{1}{2 - x} - \arctan(x - 2) + C$

16. 4 log |x + 1|

x + 1 

17. }'0S —[ 


S log \x\ - 
x 

2(x 2 - I) 


| log \x + 2\ + c 
+ C 


18. i log 


(r - D 2 
X 2 + x + \ 


+ c 


19 - lo sW+^TT 


+C 


20 . 


4x + 4x 2 + 8 


X - 2 

3 y— 


-1 c 


21. log 


Vl + X 2 


x + arctan x + C 


22. log |(x - l)/(x + 1)| ~ \ arctan x + C 

l 


23. 


4\/2 


log 


x 2 + xV 2 + 1 


- xV2 + 1 2\/2 


xV 2 

arctan + c 


1 -x 2 


24. (x 2 + 2x + 2) _1 + arctan (x + 1) + C 

25. - xj{x 5 + x + 1) + c 

1 1+3 tan (jc/2) 

26. —7= arctan 7= + C' 

V5 V5 


x/\ — a‘ 

1 


arctan 


Va 2 - 1 


log 


27. 


28. 


29. 


30. -r arctan 7 tan 

ab \b 

COS x 


1 — a * 

tan - 

1 + a 2 


+ C 


+ COS x + V a 2 — 1 sin x 
1 + a COS x 


s c 


|\/ 2 arctan (V2 tan x) + C 


') 


a(a sin x + b cos x ) 
32. (77-/4) - 1 log 2 


+ C 
+ C 


33. 


\xx/ 3 


x 2 + f arcsin - 7 = I + C 


V 3 - ^ + V3 

x ') 


+ C 


34. — V 3 - .x 2 + C 

35. -\/3 - * 2 - a/ 3 log | 

36. V x 2 + x + \ log (2V x 2 + x + 2x + 1) + C 

37. \xx/ x 2 + 5 + I log (x + x/ x 2 + 5) + c 

38. VV 2 + x + 1 - | log (2x + 1 + 2xi x 2 + x + 1) + C 

39. log (2* + 1 + 2vV + 1 ) + C 

V2 — X — X 2 V2 (x/ 2 — X — X 2 V2 

40. 


x‘ V2 lx/2 — x — x 2 

+ x'°4 — 1 — 


— arcsin 


2x + 


-) 


+ c


6.26 Miscellaneous review exercises (page 268)


1. f(x) + /(l/x) = !(logx) 2 

2. f(x) = log a/ 3/(2 h- cos x) 
4. 1 


5. (a) -A 

6. (a) x > 1 


(b) 


\x(x + l)(x +2) ' 


(c) F(ax) - F(a); F(x) 


e x 

- + 

X 


e; 



7. (a) No such function (b) — 2 X log 2 (c) \x ± 1 

9. (a) g{ 3x) = 3e 2x g(x) (b) g(nx) = ne< n ~ 1>x g(x) ( c ) 2 (d) C = 2 

10. f(X) = b xla g(x), whereg is periodic with period a 

12. (a) -Ae~ a (b) \A (c) A + 1 - \e (d) e log 2 - A 

13. (b) c 0 + nq + «(n - l)c 2 + n(n — l)(n — 2)c 3 

rn m , » 

(c) Cp(x) =^c k x k , then/ (,l) (0) = ^ ! (g) c * 

k=0 k=0 ' ' 


16. 

(a) 

|x 2 (x+ |x|) 


1 1 1 




(b) 

X ™ 

VI 

AT 

fei 

CO 

H 

«-H[CO 

■ - \x \x\ 

1 \x\ 

+ 2— 
6 x 

if |xj 

> 1 


(c) 

1 — 

e~ x if x > 0; e 3 

' - 1 if 

x < 0 




(d) 

X if 

|x[ < 1; |x 3 + 

1— If 
3 x 

|x| > 1 



17. 

f(x) 

= VC2x + 1)/ti 





18. 

(a) 

*0 

- e- 2t ) (b) H 1 

- e~ il ) 

(C) \r, 

f[l — e 

2( (2/ + 1)] 

19. 

(a) 

log3 

-2 log 2 (b) 

No real x 

exists 



20. 

(a) 

True 

(b) False 

(c) True 

(d) 

False if x < 0 

25. 

(d) 

JV 

~*t n dt = n\e~ x ie x - 

B 

-2* 






»/ Ul 


K=0 




27. 

(a) 

f(t) = 

= 2 Vf - 1 if t > 

0 





(b) 

m 

= / - ir 2 + | if 0 

< / < 1 





(c) 

fa) = 

= t - i/» + i if I 

/I < 1 





(d) 

m 

= < i f t < 0; f(t) = e* — 

1 i f 

/ >0 


28. 

(b) 

- 

ifcl 2 

(c) b = 

log 2 

(d) e 2 

Li (e 2x 2 ) 

29. 

g(y) 

= - 

all y 






30. (b) constant = f 


Chapter 7 


7.8 Exercises (page 284) 

8. (b) $\dfrac{55\sqrt{2}}{672} + R$, where $|R| < \dfrac{\sqrt{2}}{7680} < 2 \cdot 10^{-4}$

9. $0.9461 + R$, where $|R| < 2 \cdot 10^{-4}$


7.11 Exercises (page 290)


1. $1 + x \log 2 + \tfrac12 x^2 \log^2 2$

2. $\cos 1 + (\cos 1 - \sin 1)(x - 1) - \tfrac12(2 \sin 1 + \cos 1)(x - 1)^2 + \tfrac16(\sin 1 - 3 \cos 1)(x - 1)^3$


3. 

4 

X — 

„ x 3 x 4 59 x n x 6 

* ~ ~6 + T “ 120 + "8 

a = 0, b = 1, c = -J 







5. 

-f 

10. 1 

15. 

1 

6 

20. 

- 2 

2 5 

. -e/2 

6. 

ajb 

11. 1 

16. 

-1 

21. 

i log a 

26. 

e — 1/2 

7. 

2 

3 

12. log a/log b 

17. 

-1 

22. 

1 

6 

27 

<,1/6 

8. 

1 

6 

13. 1 

18. 

1 

6 

23. 

lie 

28: 

1 

2 

9. 

30. 

ii 

rHi-N fTJ 

14. | 

2; limit = f 

19. 

1 

24. 

e z 

29. 

1 

2 


33. /( 0) = 0; /'( 0) = 0; /"( 0) = 4; limit = e 2 


14 

-2 


7.13 Exercises (page 295) 
1 . 

2 . 

3. 

4. 

13. 

14. 

15. 


5. 

6 . 
7.. 


6 a s x - 0 ; 4 /tt as * ->■ w/2 

a = -3; b = f 

a = 4; b = 1 

16. (a) T(x) = tan 

17. tE\L 

At cos kt 

2 k 


X — I sin x 


18. 


(a/bf 

x 

6 

1/V2fl 
- 2 

(b) S(x) = \x 


9. 1 

10 . - i 

1 1 . n(n + l)/2 

1 2 . \(a 2 — b 2 )l(a 2 b 2 ) 


£ sin x (c) | 


7.17 Exercises (page 303) 


l. 

0 

8. 

e/2 

15. 

0 

22 . 

1 

2 . 

1 

9. 

+ °o 

16. 


23. 

e e 

3. 

1 

3 

10 . 

1 

17. 

- 1 

24. 


4. 

1 iVb 

11 . 

0 

18. 

1 

25. 

1 

2 

5. 

24 ' 

12 . 

0 

19. 

e 

26. 

log 2 

6. 

1 

13. 

1 

2 

20 . 

1 

27. 

1 

2 

7. 

0 

14. 

0 

21 . 

1/e 

28. 

II 

29. 

1 

2 







30. 

c = 1; limit = 

iV3 






32. 

(b) 11.55 years 

(c) 

11.67 years 






7 

5 


8.5 Exercises (page 311) ** 


1. 

11 

CO 

Si 

1 


2. 

y = §* z + 

1 v 5 


3. $y = 4 \cos x - 2 \cos^2 x$

4. $y = x^2 - 2 + 2e^{-x^2/2}$

5. .c = le 2t + §e~‘ 

6. $y = (x + C)/\sin x$


Chapter 8 

7 - 

8. y = sin x + C/ sin x 



10. y = XT(X) + cx 

11. $f(x) = 1 + \log x$

12. Only the function given

14.

y = 

(V 2e ix + e lx - e x f 

15. 

y = 

1 /(x 2 - x + 2 - e~ x ) 

16. 

y = 

(. x 3 - X? 

17. 

y = 

1 l(x 2 + x — x 2 log x) 



te x + e 2 ~ x \ 112 

18. 

(a) 

H 2, ) 



Ce 3x +2 

20. 

(a) 

y = — ; With 

o 3x - 1 

8.7 

Exercises (page 319) 


(b) 


e x + e 2 ~ x \ llz 
2* / 


(c) / 


sinh x 
x 


b + 2 
b - 1 


(b) y 


e Zx + 2C 

- e 3x _ c 


with 


b - 1 
b + 2 


1. $100(1 - 2^{-1/16}) = 4.2$ percent

2. Four times the initial amount

3. (a) $T = (\log n)/k$ (b) $w(t) = (b - t)/(b - a)$

4. $256(1 - e^{-t/8})$ if $0 \le t \le 10$; $16 + 166e^{20 - 2t}$ if $t \ge 10$

6. $v \to \sqrt{mg/k}$

7. (c) 54.5 min (d) $T = [1 + (600 - t)k + (1400k - 1)e^{-kt}]$

8. 55°

9. 19.5 lb

10. 54.7 lb

13. For Equation (8.20), $x = x_0 e^{k(t - t_0)}$; for Equation (8.22), $\alpha = Mk$

15. $x = M\left[1 + \exp\left(-M \int_{t_0}^{t} k(u)\,du\right)\right]^{-1}$

16. (a) 200 million (b) 217 million

17. (a) 0.026 per year (b) 0.011 per year; 260 million; 450 million

18. $dx/dt = kx(1 - \alpha t)$; $x = x_0 e^{k(t - \alpha t^2/2)}$; curve (d)
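
Two of the answers in this section can be checked numerically from what the answers themselves state. The following is only a rough sketch: it evaluates the expression in answer 1 and tests, by a central difference, that the closed form in answer 18 satisfies the differential equation quoted with it. The sample values chosen for `x0`, `k`, and `alpha` are arbitrary assumptions, not taken from the exercises.

```python
import math

# Answer 1: the percentage 100 * (1 - 2^(-1/16)).
percent = 100 * (1 - 2 ** (-1 / 16))


def x(t, x0=1.0, k=0.03, alpha=0.01):
    """Closed form from answer 18; x0, k, alpha are arbitrary sample values."""
    return x0 * math.exp(k * (t - alpha * t * t / 2))


def residual(t, k=0.03, alpha=0.01, h=1e-5):
    """Central-difference estimate of dx/dt minus k*x*(1 - alpha*t)."""
    dxdt = (x(t + h, k=k, alpha=alpha) - x(t - h, k=k, alpha=alpha)) / (2 * h)
    return dxdt - k * x(t, k=k, alpha=alpha) * (1 - alpha * t)


if __name__ == "__main__":
    print(round(percent, 1))
    print(max(abs(residual(t)) for t in (0.0, 5.0, 50.0)))
```

A residual close to zero at several sample times is consistent with the closed form solving the equation; it is a sanity check, not a proof.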


8.14 Exercises (page 328) 


1. $y = c_1 e^{2x} + c_2 e^{-2x}$

2. $y = c_1 \cos 2x + c_2 \sin 2x$

3. $y = c_1 + c_2 e^{4x}$

4. $y = c_1 + c_2 e^{-4x}$

5. $y = e^{x}(c_1 \cos \sqrt{2}\,x + c_2 \sin \sqrt{2}\,x)$

6. $y = c_1 e^{x} + c_2 e^{-3x}$

7. $y = e^{x}(c_1 \cos x + c_2 \sin x)$

8. $y = e^{x}(c_1 \cos 2x + c_2 \sin 2x)$

9. $y = e^{-x}(c_1 + c_2 x)$

10. $y = e^{x}(c_1 + c_2 x)$

11. y =1 -ie- 3x ' 2 



12. y = -cos(5x - 15) 

13. y = ^ e 6(x-1) + ^ where a = 2- \ r 5, b = 2 + V 5 

14. $y = 2e^{-2x}(\cos x + \sin x)$

15. u(x) = | e “ x ~ " sin 5x; v(x) = fe -2x ~ 17 sin 3x 

16. «(x) = 6(e ix — e _x )/5; v(x) = e x — e~ 5x 

17. $k = n^2\pi^2$; $f_k(x) = C \sin n\pi x$ ($n = 1, 2, 3, \ldots$)

19. (b) No (c) If k it 0 the condition is Cly flo 5^ 

20. (a) $y'' - y = 0$

(b) $y'' - 4y' + 4y = 0$

(c) $y'' + y' + \tfrac54 y = 0$

(d) $y'' + 4y = 0$

(e) $y'' - y = 0$


8.17 Exercises (page 333)

1 . y = qe 1 + qe _x - x 

2. y = c x e x + C 2 — 2x - x 2 — Jx 3 

3. y = c x e~ x + c 2 + ix 3 

4. y = e x (c 1 cos \/2x + q sin V 2x) - - 2 % + f x + §x 2 + |x 3 

5 . y = qe* + c 2 e ix + - 3 % + £x + jx 2 

6 . y = qe 2x + qe :te — + -.U — x 2 — Jx 3 

7. y = (q + jx)e 2x + qe -2 * 


8. 

y = Cj cos 2x + q sin 2x + 

1 ,,—2x 

9. 

y = q<r 2ir + (q + ^x)e x 

10. 

y = qe~ 2x + c 2 e x + \e 2x 


11. 

y = qe -2 * + (c 2 + \x)e x + 

l e 2x 

12. 

y = (q + qx + }x 3 )e x + x • 

t- 2 

13. 

y = (q + qx ™ lo g \x\)e~ x 



14. y = q sin x + (q + log |csc x + cot x\) cos x — 2 

15. j = qe* + qe - * + (e x — e~ x ) log (1 + e x ) — xe x — 1 

16. y = (q + 2 x)e x + %e~ x + qe~ 2 * — i — J(e* + e -2 *) log (1 + e x ) 

f(c 1 -f c 2 x)e~ Sx if x < 1 or x > 2, 
y ((a + bx)e~ 3x + 9 if 1 < X < 2 

18. y = qe 3x + qe -3 * + |xe 3x 

19. y = (q — ix) cos 3X + ( c 2 — ~^g) sin 3X 

20. $y = (c_1 - \tfrac12 x) \cos x + c_2 \sin x$

21. $y = c_1 \cos x + (c_2 + \tfrac12 x) \sin x$

22. y = q COS 2x + q sin 2x + x cos X + f sin x 

23. y = q COS 2x + c 2 sin 2x + x sin x — I COS x 

24. y = q + c 2 e 3x — fe 2 *(3 sin x + cos x) 

25. y = q sin x + q cos x + -^e 2x (3 sin 3x — cos 3x) 


8.19 Exercises (page 339) 

1 . 2V2 

2 . ±140tt 

3 . A = C,m=k,P = x — ^ 
4. y = 3 COS 4ttx 

5- c = (yl +_;; 2 )i/2 
6 • y = J\/6, y" = - 12 y = -W 6 

7TX 

7. y = -A sin — , where A is positive 


8 . /(?) 


r,;r 


sinX+ 1 
sin 


COS t if o < t < 2n, 
>2tt 

9. (a) \KlWl) (b) R < V2 


10. r(t) = \gt 2 — ct + c(t 


v \ kl\ 

-w) 


. . . kt 

1 1 . r(t) = Ct + C - — 1 1 log 1 — — 


12. r(t) = — ” log 


W - kt


8.22 Exercises (page 344)

!• /+§=0 

2. y' + 2y = 0 

3. yy' -X = 0 
xy' +y = 0 
2xy' - je 0 
(x 2 - y 2 — 1 )/ - 2xy 
(x 2 + 2xy — / •• 2)y 


4 

5. 

11 . 

12 . 

14. 

15. 


6 . 

7. 

8. 
ft 

10 . 


0 


(x 2 _ / _ l)y — 2xy = 0 
(x — 1 )y' -xy = 0 

(x 2 — 4)/ y= 0 
y' + jtan x = 0 

/ 1 = 0 


yf\~— ~x?y' + 


./-2xy+x 2 +2=0 


x + y = -1 is both an integral curve and an isocline 
y = cx + C 2 ; envelope: y = — |x 2 


8.24 Exercises (page 347) 


1. $y^3 = \tfrac34 x^4 + C$

2. $\cos x = Ce^{1/\cos y}$

3. $y(C + \log|x + 1|) = 1$

4. $y - 2 = C(y - 1)e^x$

5. $y^2 + 2\sqrt{1 - x^2} = C$

11. (y + l)e~ tv =e x (cosx — sin x) +C 

12. x 2 - 1 = C(y 2 + 1) 

13. f(x) = 2c*- 1 


6. $y = C(x - 1)e^x$

7. $\arctan y + \arcsin x = C$

8. $(1 + y^2)(1 + x^2) = Cx^2$

9. $y^4(x + 2) = C(x - 2)$

10. $1 + y^2 = Cx^2 e^{x^2}$


14. f(x) = \/ 5x 2 + 1 

15. f(x) = — log(l+ x 2 ) 

16. $f(x) = \pm 1$; $f(x) = \sin(x + C)$; also, those continuous functions whose graphs may be
obtained by piecing together portions of the curves $y = \sin(x + C)$ with portions of the
lines $y = \pm 1$. One such example is $f(x) = -1$ for $x \le 0$, $f(x) = \sin(x - \tfrac12\pi)$ for $0 \le x \le 3\pi$,
$f(x) = 1$ for $x \ge 3\pi$

17. f(x) = c 

18. f(x) = A^ c 

19. f(x) = 0 

20. f(x) = 0 


8.26 Exercises (page 350) 

2. x 2 + / = c 

3. y = xlog [Cx | 

4. x 2 + / = Cx 4 

5. / = C(x 2 + y 2 ) 3 

6. x 2 + 2 Cy = C 2 , C > 0 

7. y(Cx 2 - 1) = x 

X 

8. arctan - + log |y| = C 


„ y x y J 

9. - - - + log — = C 

x y x 

10. tan^- = Cx 

2x 

11. (x + yf = Cx 4 / 


8.28 Miscellaneous review exercises (page 355)

1. $3x - 2y = C$

2. $x^2 - y^2 = C$

3. $x^2 + y^2 - Cx + 1 = 0$

4. $2x^2 + y^2 = C$

5. $2y^2 - x^2 = C$

6. $y^2 = x + C$

7. $xy = C$

8. $y^2 - \log(\sin^2 x) = C$

9. $(x - C)^2 + y^2 = C^2 - 1$

10. $x^2 + y^2 - C(x + y) + 2 = 0$

11. $y = -2x \log x$






12. y = — — x log x 

13. f ( x ) = Cx n ,orf(x)= Cx’/" 

14. f(x) = Cx n ^, or f{x) = Cx 1 6 2n) 


16. y =|(1 '); b=Y e ~ 3 

17. y = -6x 2 + 5x + 1 

18. 59.6 sec 


20. — — — sec, where R is the radius of the base and h is the height of the cone (in feet) 

21 . y = e x 

22. y 3 = — Jx + Cx -1 / 2 for x > 0, or y a = —lx for all x 

23. m = -1; f log |j| = \e~ 2x -f Cy 2 

24. (a) a = 0, b = \ (b) /(x) = 2X 1 ' 2 

25. (b) y = e ix — e~ x 3 ( 3 

26. (a) 1/(1 + 1) grams in t years (b) 1 gm _1 yr 1 

27. [1 — ^(2 — \/2)/] 2 grams in f years; 2 + s/l years 

28. (a) 365e~ 2 - 65 ' citizens in t years (b) 365(1 — e~ 2 - 65t ) fatalities in 1 years 

29. 6.96 mi/sec = 25,056 mi/hr 

30. (a) Relative minimum at 0 (b) a = f, b = 2 9 ° (d) § 

31. (b) Minimum (c) |


Chapter 9


9.6 Exercises (page 365)

1. (a) 2i (b) -/ (c) i - \i (d) 18 + / (e) -§ + fi (f) 1 + / 

(g) 0 _ (h) 1 + / 

2. (a) Vl (b) 5 (c) 1 (d) 1 (e) \/2 (f) ^65 

3. (a) r = 2, = f V (b) r == 3, 6 = -\n (c) r = 1, 0 -_n ( d) r = 1 , 6 = 0 

(e) r = 2\\e = 5^/6 (f) r = 1, 6 = ^ (g) r = iVl, 0 = fr 

(h) r = 2 v X 2, 0 = (i) r = \yj 2, 0 = -J77 (j) r = fi = -i w 

4. (a) y = 0, x arbitrary ( b ) x > 0, y = 0 (c) All x and y (d) x = 0, 

y arbitrary; or y = 0, x arbitrary (e)x = l,_y=0 (f) x = l, y = 0 

9.10 Exercises (page 371) 

1. (a) $i$ (b) $-2i$ (c) $-3$ (d) 1 (e) $1 + i$ (f) $(1 + i)/\sqrt{2}$ (g) $\sqrt{2}\,i$ (h) $-i$

2. (a) $y = 0$, $x$ arbitrary (b) $x = y = 0$ (c) $x = 0$, $y = (2n + 1)\pi$, where $n$ is any integer (d) $x = 1$, $y = \tfrac12\pi + 2n\pi$, where $n$ is any integer

3. (b) $z = 2n\pi i$, where $n$ is any integer
6- c_ k = \{a_ k + ib_ k ) fork =1,2 

8. (c) g\/3 + li, — Jv 7 3 + J/, -/ 

(d) a + bi, -a bi, -b + ai, b ai, where a = \ Jl + V2 and b = \Jl s/l 

(e) a -1 bi, -a + bi, b + ai, -b ** ai, where a and b are as in (d) 

11. (a) 1 , e~ nl2 , e~* (c) —w < arg(zj) + arg (z 2 ) < u 

13. $B = A/(b - \omega^2 + a\omega i)$


Chapter 10


10.4 Exercises (page 382) 


1. (a) Converges (b) 0

2. (a) Converges (b) $-1$

3. (a) Diverges

4. (a) Converges (b) $\tfrac15$

5. (a) Converges (b) 0

6. (a) Diverges

7. (a) Converges (b) 0

8. (a) Diverges

9. (a) Converges (b) 1

10. (a) Diverges

11. (a) Converges (b) 0

12. (a) Converges (b) $\tfrac13$

13. (a) Converges (b) 0

14. (a) Converges (b) 0

15. (a) Converges (b) 0

16. (a) Converges (b) 0

17. (a) Converges (b) $e^2$

18. (a) Diverges

19. (a) Converges (b) 0

20. (a) Diverges

21. (a) Converges (b) 0

22. (a) Diverges

23. $N > 1/\epsilon$

24. $N > 1/\epsilon$

25. $N > 1/\epsilon$

26. $N > 1/\epsilon$

27. $N > \sqrt{2/\epsilon}$

28. $N > \dfrac{\log \epsilon}{\log(9/10)}$


34. (c) Let S n 


L t 

b — a v* / 

—2 / » 

fc— 0 


4- k - 


b — a 


, and define /„ similarly as a sum from 1 to //.Both 


sequences {s. n } and {/„} converge to the integral J'&/(x) dx. 


10.9 Exercises (page 391) 

22. (a) 1 (b) 2e — 3 (c) e + I 

23. (b) 5 

24. (a) Identical (b) Not identical (c) Not identical (d) Identical 


*10.10 Exercises on decimal expansions (page 393) 
1. A 

V 9 
Z. 5 1 

•j 99 

3 . 2 00 


5. 


41 

m 

i 

7 


10.14 Exercises (page 398) 


1. Divergent

2. Convergent

3. Convergent

4. Convergent

5. Convergent

6. Convergent

7. Convergent

8. Convergent

9. Divergent

10. Convergent

11. Divergent

12. Convergent

13. Divergent

14. Convergent

15. Convergent for $s > 1$; divergent for $s \le 1$

16. Convergent

17. Convergent

18. Convergent


10.16 Exercises (page 402)

1. Convergent

2. Convergent

3. Convergent

4. Divergent

5. Divergent

6. Divergent

7. Divergent

8. Convergent

9. Convergent

10. Divergent

11. Convergent

12. Divergent

13. Convergent

14. Convergent if $0 < r < 1$, or when $x = k\pi$, $k$ any integer


10.20 Exercises (page 409) 

1. Conditionally convergent

2. Conditionally convergent

3. Divergent for $s \le 0$; conditionally convergent for $0 < s \le 1$; absolutely convergent for $s > 1$

4. Absolutely convergent

5. Absolutely convergent

6. Absolutely convergent

7. Divergent

8. Divergent

9. Divergent

10. Conditionally convergent

11. Absolutely convergent

12. Divergent

13. Absolutely convergent

14. Absolutely convergent

15. Divergent

16. Absolutely convergent

17. Absolutely convergent

18. Absolutely convergent

19. Conditionally convergent

20. Conditionally convergent

21. Divergent

22. Conditionally convergent

23. Divergent

24. Conditionally convergent

25. Divergent for $s \le 0$; conditionally convergent for $0 < s \le 1$; absolutely convergent for $s > 1$

26. Absolutely convergent

27. Absolutely convergent

28. Divergent

29. Absolutely convergent

30. Absolutely convergent

31. Absolutely convergent

32. Absolutely convergent

33. $z = 0$

34. All $z$

35. All $z$ satisfying $|z| < 3$

36. All $z$

37. All $z$ except negative integers

38. All z ^ 1 satisfying |z| 1 

39. |z| < e -1 / 85 

40. All z 

41. Allz 0 satisfying 0 < \z — l| II 

42. All z ?£ -1 satisfying |2z + 3| < 1 

43. All z = x + iy with x > 0 

44. All z satisfying |2 + l/z| > 1 

45. All z satisfying (2 + l/z| > 1 

46. All z 5* 0 


47. $|x - k\pi| \le \pi/4$, $k$ any integer

48. $|x - k\pi| \le \pi/6$, $k$ any integer


10.22 Miscellaneous review exercises (page 414) 

1. (a) 0 

(b) Converges if $c \le 1$; limit is 0 if $c < 1$; limit is 1 if $c = 1$; diverges if $c > 1$

2. (a) 1 (b) The larger of a and b 

3- 5#1 + 3 a 2 

4 • + Vi) 

5. 0 

7. Divergent 

8. Convergent ifs < divergent ifs ^ 

9. Convergent

10. Divergent

11. Divergent 

14. c < 3 

15. a 2 3 

a + 1 

17. When a > -1, limit is — — ; when a < -1, limit is 0 

a + 2 


10.24 Exercises (page 420) 


1. Divergent 

2. Convergent 

3. Convergent 

4. Convergent 

5. Convergent 

11. C = 2 1 integral has value | log f 

12. C = J; integral has value \ log ^ 


J _ 

13. C = \\/2 ; integral has value log \/2 


14. a = b = 2e - 2 


15. a = 1; 6 = 1- 


V 3 


16. (b) Both diverge 

17. (c) Diverges 


6. Convergent 

7. Convergent 

8. Convergent 

9. Divergent 

10. Convergent ifs > 


Chapter 11 


11.7 Exercises (page 430) 

1. $r = 2$; convergent for $|z| < 2$

2. $r = 2$; convergent for $|z| \le 2$, $z \ne 2$

3. $r = 2$; convergent for $|z + 3| \le 2$, $z \ne -1$

4. r = |; convergent for |z| < | 

5. r = convergent for |z| < g 

6. $r = e$; convergent for $|z| < e$

7. $r = 1$; convergent for $|z + 1| < 1$

8. $r = +\infty$

9. $r = 4$; convergent for $|z| < 4$

10. r = 1; convergent for |z| < 1 

11. r = 1 

12. $r = 1/e$

13. $r = +\infty$ if $a = k\pi$, $k$ an integer; $r = 1$ if $a \ne k\pi$

14. $r = e^{-a}$

15. $r = \max(a, b)$

16. $r = \min(1/a, 1/b)$

11.13 Exercises (page 438) 


1. 

M < i; 

1/(1 + X 2 ) 

2. 

|x < 3; 

1/(3 - x) 

3 . 

M < i; 

xl( 1 - xf 

4. 

\x\ < l; 

-xli 1 + x)' 


1 ; divergent if i < 1 
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5- \x\ < b - 


1 


log (1 + 2x) 


+ 2x 2x 

6. -J < x < §; -log (1 - 2x) 

2 x 

1. -2 < x < 2; - arctan - 

— x 2 


8. All x; 

9. A1 lxr 3 (e x - 1 


\x 2 ) if * * o, 0 if* = 0 


10. All x; 

22 . - 


(x - l) 2 
V2 2 97 


if X 1 : I if x = 1 


98! 


23. o 0 = 4\/2, a, = 0, a 2 = 5v 7 2, a 3 = 0, a 4 = ^^2 


11.16 Exercises (page 443) 


1 . a. 


(« + 3 )(« - 2) 


2. 


' i+2 = (« + 2)(» + 1) fl " f0r " - ° ; f {x) = 1 " 3 * 2 

10 




a n for n > 0; /(*) = lx 


(fl + 4)(n - 3) 

“ n+2 (» + 2)(/i + 1) 

3. All x 

4. All X 

5. A 1 lx ; a = 1 , b =0 

6. All x; f(x) = e x 2 

7. All x; f(x) = I 

8. All jc; f(x) = cos 2x 

9. All x; f(x) = * + sinh 3x 

12. y = 1 + * + x 2 + |* 3 + . . 

13. y = x + ix 4 + -&X 7 + iff o at 1 ® ■*•••• 

] 4 V ~ 1 v2 _i_ -l.y5 + y*8 I u _ yll [ 

i 1 -*. J 2 A ' 12 1 * 060 a ‘ 8800^ ' 

oo 

2 <x n x n 

— 

71=0 


^3n 

Z <2 3X5 6) [(3/i - 1) (3n)] 

+ Cj ( X + 

oo > 

V ( — 1)”jc 2 " 


Ifi y = r 0 1 + 


r 3«-t-l 


17. y = c 0 1 + 


X—i 2 • 4 • • ■ (2n ) 

n— 1 


-Ir 


k=i 


(3 4X6 ■!)■■■ [(3n ) ■ (3/7 + 1)] 


«=i 


3 • • ■ (2/7 - 1) 


18. a, = -1, a 2 = 0, a 3 = f; /(x) = (x + \)e 


-2x 


19. « 5 = 0, t/ 6 


; f(x) 


sin * cos x — 1 


8! x 

20. (c) V2 = 1.4142135623 

21. (b) V3 = 1.732050807568877 
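
The decimal values quoted in answers 20 and 21 can be reproduced with exact integer arithmetic. The sketch below is one possible check, not the method the exercises ask for; the helper name `sqrt_digits` is hypothetical, and it truncates rather than rounds.

```python
from math import isqrt


def sqrt_digits(n, places):
    """Decimal expansion of sqrt(n), truncated (not rounded) to `places` digits."""
    # floor(sqrt(n) * 10**places), computed exactly with integers
    scaled = isqrt(n * 10 ** (2 * places))
    s = str(scaled)
    return s[:-places] + "." + s[-places:]


if __name__ == "__main__":
    print(sqrt_digits(2, 10))
    print(sqrt_digits(3, 15))
```

Because `isqrt` works on integers, no floating-point rounding enters the comparison with the printed digits.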


if I/O; /(0) = f; /( ff ) 


-2/tt 2 
Chapter 12


12.4 Exercises (page 450) 

1. (a) (5,0,9) (b) (-3,6,3) (c) (3, -1,4) (d) (-7,24,21) (e) (0, 0, 0) 

5. $x = \tfrac15(3c_1 - c_2)$, $y = \tfrac15(2c_2 - c_1)$

6. (a) $(x + z,\ x + y + z,\ x + y)$ (c) $x = 2$, $y = 1$, $z = -1$

7. (a) $(x + 2z,\ x + y + z,\ x + y + z)$ (b) One example: $x = -2$, $y = z = 1$

8. (a) $(x + z,\ x + y + z,\ x + y,\ y)$ (c) $x = -1$, $y = 4$, $z = 2$

12. The diagonals of a parallelogram bisect each other 


12.8 Exercises (page 456) 

1. (a) -6 (b) 2 (c) 6 

2. (a) (A B)C = (21,28, -35) 


(d) 0 (e) 4 

(b) A (B + C) 


(d) A(B- C) = (30,60, -105) (e) A/(B ■ C) , 

5. Oneexample: (1. -5, -3) 

6. One example: x = -2, y = 1 

7. C = f( — 1, -2.2), D =1(22, -1, 10) 

*• C=A<1,2.3,4,5W> = (2 

9. (a) \/ 74 (b) \/ 14 (c) V53 (d) J 

10. (a) (1, -1) or (-1, 1) (b) (1, 1) or (-1, -1) 

(d) ( b , -a) or (-b, a) 


64 (c) (A + B) C 

2 4 — 7 \ 

I5’l5 ’1? 


72 


ll. (a) 


1 


V 42 


(4, -1,5) (b) 


1 


V 7 14 


(c) (3,2) or (-3, 
(-2, -3, 1) (c) — = (1, 0, 1) 


- 2 ) 


<d) yl - u (e) vS - 5 ’ 4) 

12. A and B. C and D. C and E, D and E 

13. (a) (2, -1) and (-2, 1) (b) (2, 1) and (-2, -1) (c) (1, 2) and (-1, -2) 

(d) (1,2) and (-1, -2) 

14. One example: C = (8, 1, 1) 

15. One example: C = (1, -5, -3) 

16. P =§1(3,4), Q = U -4, 3) 

17. P = |(1, 1, 1, 1), Q = i(-3, -1, 1, 3) 


18 . ±—( 0 , 1 , 1 ) 

V 2 

20. The sum of the squares of the sides of any parallelogram is equal to the sum of the squares 
of the diagonals. 

22. 4 ; 12V2 

1 / 7 2-5 — 14 \ 

23. C =-i-(l,2, 3, 4, 5), D =n( 10 > 2 ’ 3 ’T ,- r) 

24. C = tA, D = B - tA, where / = (A B)I(A . A) 


12.11 Exercises (page 460) 

1 \ l B 

2. IB 
3.

5. 

6 . 
8 . 

9. 

10 . 
14. 

17. 

18. 


6 3 ->2 /6 3 -2 

( a ) y ’ y ’ *7 1 6 ) I tj » tj i j 

n a/— a/IH 

U > V 41 , V 41 

7tt/8 

t r/6 
0 

(b) Equation holds for all x and y if cos 0=1; if cos 6^1 the only solution is x = y = 0 
All except (b). 

(c) All except Theorem 12.4(a). 

(a) All 



12.15 Exercises (page 467) 


1. 

2 . 

3. 

7. 

9. 


(a) x =y 
(d) x = 1, 


_ l 
~ 2 


(b) x 


y 


= 6 


_.l 

2 > 


y = i (c) x = 4 , y = - 1 


x= i, y = f 

x = 3y = 
All t ^ 0 
(c) 7/ - 4(i + 


-4 


j) 


10. (b) j = B - A, A = i(C - 7?) (c) J(15,4 - 14B + 5C) 

11. {^(}. {5}, {C}, {D}, {A, B), {A, C}, {A, D}, {B, C}, {C, D} 

12. (a) Independent (b) One example: D = A (c) One example: E = (0, 0, 0, 1) 
(d) For the choice E = (0, 0, 0, 1), we have X = 2A + B — C + 3E 


13. (c) t = 0, y/2, -Vl 

14. (a) {(1,0, 1, 0), (0, 1, 0, 1), (2, 0, -1, 0)} (b) The set given (c) The set given 

17. {(0, 1, 1), (1, 1, 1), (0, 1, 0)}, {(o, 1, 1), (1, 1, 1), (0, 0, 1)} 

18. {(1,1, 1,1), (0,1, 1,1), (0,0, 1,1), (0,0, 0,1)}, 

{( 1 , 1 , 1 , 1 ), ( 0 , 1 , 1 , 1 ), ( 0 , 1 , 0 , 0 ), ( 0 , 0 , 1 , 0 )} 

19. L{U) =L(T) =L(S) 

20. One example: A = {£ v . . . , E n }, B ~{E 1 + £> + E w , E n _ x + E n , E n + EJ 


12.17 Exercises (page 470) 

1. (a) -1 -i (b) -1 Si (c) 1 — i (d) -1 + i (e) -1 - i 

(f) 2 - / (g) -i (h) -l + 2 i (i) -3 - 2i (j) 2 i 

2. One example: (1 + i, -5 — 3 i, 1 — 3i) 

8. rrji 

9. 3A -B +2C 


Chapter 13 


13.5 Exercises (page 477) 

1. (b), (d), and (e) 

2. (a) and (e) 

3. (c), (d), and (e) 

4. (b), (e), and (f) 

5. (a) No (b) No (c) No 

6. A, B, C, D, F are collinear 

7. Intersect at (5, 9, 13) 

8. (b) No 

9. (a) 9 1 2 + 8t + 9 (b) 65 
13.8 Exercises (page 482)

1. (c) and (e) 

2. (a), (b), and ( c ) 

3. (a) $x = 1 + t$, $y = 2 + s + t$, $z = 1 + 4t$
(b) $x = s + t$, $y = 1 + s$, $z = s + 4t$

4. (a) $(1, 2, 0)$ and $(2, -3, -3)$ (b) $M = \{(1, 2, 0) + s(1, 1, 2) + t(-2, 4, 1)\}$

6. (a), (b), and (c) x — 2y + z = -3 

7. (a) (0, -2, -1) and (-1, -2,2) 

(b) M = {(0, -2, -1) + s( -1, 0, 3) + t( 3, 3, 6)} 

8. Two examples: (-5, 2, 6) and (-14, 3, 17) 

9. (a) Yes (b) Two examples: (1, 0, -1) and (-1, 0, 1) 

in (-2 a _ z) 

11. (a), (b) and (c) No 

13. x - y = - 1 

13.11 Exercises (page 487) 

1. (a) (-2,3, -1) (b) (4, -5, 3) (c) (4, -4,2) (d) (8, 10,4) 

(e) (8,3, - 7 ) (f) (10, 11, 5) (g) (-2, -8, -12) (h) (2, -2,0) 

0) (-2,0,4) 

1 (a) * v! ( “ 4 ' 3 ' (b) ± ym ( ~ 4 '- - ,8 ’ 7 > 

3. (a) - 1 /- (b) | v 35 (c) W 3 

4 . 8 i +j — 2 k 

6. (b) COS 6 is negative (c) V5 

9. (a) One solution is B = — / — 3k (b) j -j — A: is the only solution 

11. (a) Three possibilities; D = B + C — A = (0, 0,2), D = A + C — B = (4, 

D = A + B - C = (-2, 2, 0) (b) |V6 

12. -4; 8V3; -jV3 

13.14 Exercises (page 491) 

1. (a) 96 (b) 27 (c) $-84$

2. $0,\ \sqrt{2},\ -\sqrt{2}$

3. 2 

6. (a) (2b — 1)/ + bj + ck, where b and c are arbitrary (b) -\i + f j 

11. -3i + 2j + 5k 

14. (b) 2 

15. (b) V 2005/41 

17. x = 1, y = -1, z = 2 

18. x = 1, y= -1, z = 2 

19. x = 1, y = 4, z = 1 

13.17 Exercises (page 496) 

1. (a) (-7, 2, -2) (b) -7x + 2v - 2z = 0 (c) -7x + 2y - 2z = - S 

2. (a) (i f, -f) (b) -7, -1,| (c) | (d) (-1,-14; 4) 

3 3x — y + 2z = — 5; 9/A/14 

4. (b) -i|V6 

5. (a) (1,2, -2) (b) x + 2y - 2z =5 (c) | 

6. lOx - 3y - 7z + 17 =0 

7. Two angles: w/3 and 2w/3 


1) 


- 2 , 2 ), 
8. $x + 2y + 9z + 55 = 0$

9. X(r) = (2, 1, -3) + r(4, -3, 1) 

10. (b) N = (1, 3, -2) (c) t = 1 (d) 2x + 3y + 2z + 15 = 0 

(e) a + 3y — 2z + 19 = 0 

11 . x +Vly + z = 2 +V2 

12 . 6 

13. — L=(7, -8, -3) 

V 122 

14. a - j ± z = 2 

15. (|,0, i) 

17. X{t) = (1,2, 3) + 7(1, -2, 1) 

19. (b) P = -i 5 -(5, -14,2) 

13.21 Exercises (page 503) 

3. $r = ed/(1 - e \sin\theta)$; $r = -ed/(1 + e \sin\theta)$

4. e = 1, d = 2 

5. e = |, d = 6 

6. e = 4, t/ = 6 

7. e = 2, d = 1 

8. e =2, d — 2 

9. e = 1, d =4 

10. $d = 5$, $r = 25/(10 + 3\cos\theta + 4\sin\theta)$

11. $d = 5$, $r = 25/(5 + 4\cos\theta + 3\sin\theta)$

1 2 . d = lV2, r = l/(cos 0 + sin 0 4- -J\/2), r = l/(cos0 + sin 0 - Ja/2) 

13. (a) $r = 1.5 \times 10^8/(1 + \cos\theta)$; $7.5 \times 10^7$ miles (b) $r = 5 \times 10^7/(1 - \cos\theta)$;
$2.5 \times 10^7$ miles


13.24 Exercises (page 508) 

1. Center at $(0, 0)$; foci at $(\pm 8, 0)$; vertices at $(\pm 10, 0)$; $e = \tfrac45$

2. Center at $(0, 0)$; foci at $(0, \pm 8)$; vertices at $(0, \pm 10)$; $e = \tfrac45$

3. Center at $(2, -3)$; foci at $(2 \pm \sqrt{7}, -3)$; vertices at $(6, -3)$, $(-2, -3)$; $e = \sqrt{7}/4$

4. Center at (0, 0); foci at ( ±4 0); vertices at (±|, 0); e= | 

5. Center at $(0, 0)$; foci at $(\pm\sqrt{3}/6, 0)$; vertices at $(\pm\sqrt{3}/3, 0)$; $e = \tfrac12$

6. Center at $(-1, -2)$; foci at $(-1, 1)$, $(-1, -5)$; vertices at $(-1, 3)$, $(-1, -7)$; $e = \tfrac35$

7. $7x^2 + 16y^2 = 7$

„ (A + 3) 2 (y - 4) 2 , 

8 - 9 

(x + 3) 2 (y - 4) 2 
9 16 1 

10 . - i 4)2 + (y - 2) 2 = 1 


11. $\dfrac{(x - 8)^2}{25} + \dfrac{(y + 2)^2}{9} = 1$

12. $\dfrac{(x - 2)^2}{16} + \dfrac{(y - 1)^2}{4} = 1$

13. Center at $(0, 0)$; foci at $(\pm 2\sqrt{41}, 0)$; vertices at $(\pm 10, 0)$; $e = \sqrt{41}/5$

14. Center at $(0, 0)$; foci at $(0, \pm 2\sqrt{41})$; vertices at $(0, \pm 10)$; $e = \sqrt{41}/5$

15. Center at $(-3, 3)$; foci at $(-3 \pm \sqrt{5}, 3)$; vertices at $(-1, 3)$, $(-5, 3)$; $e = \sqrt{5}/2$

16. Center at $(0, 0)$; foci at $(\pm 5, 0)$; vertices at $(\pm 4, 0)$; $e = \tfrac54$

17. Center at $(0, 0)$; foci at $(0, \pm 3)$; vertices at $(0, \pm 2)$; $e = \tfrac32$

18. Center at $(1, -2)$; foci at $(1 \pm \sqrt{13}, -2)$; vertices at $(3, -2)$, $(-1, -2)$; $e = \tfrac12\sqrt{13}$

19. $\dfrac{x^2}{4} - \dfrac{y^2}{12} = 1$

20. $y^2 - x^2 = 1$


21 . -t-tz- 


r 

16 


22. (y - 4) 2 


1 

(x + l) 2 


= 1 


23 - 

24. ±W 

25. 4x 2 — y 2 - 11 

26. Vertex at (0,0); directrix x = 2; axis y = 0 

27. Vertex at (0,0); directrix x = — | ; axis y = 0 

28. Vertex at (J, 1); directrix x = — |; axis y = 1 

29. Vertex at (0,0); directrix y = — | ; axis x = 0 

30. Vertex at (0,0); directrix y = 2; axis x = 0 

31. Vertex at (-2, — |); directrixy = — c?-; axis x = -2 

32. x 2 = -y 

33. y 2 = 8x 

34. (x + 4) 2 = -8(y - 3) 

35. (y + l) 2 = 5(x - ') 

36. (x - If = 2(v + i) 

37. (y - 3) 2 = -8(x - 1) 

38. $x^2 - 4xy + 4y^2 + 40x + 20y - 100 = 0$
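
Several answers in this section list a center, the foci, and the vertices together with an eccentricity. As a rough cross-check, the eccentricity follows from those data as $e = c/a$, where $c$ is the center-to-focus distance and $a$ the center-to-vertex distance. The sketch below applies this to the data of answers 1, 6, 16, and 17; the function name `eccentricity` is a hypothetical helper.

```python
from fractions import Fraction


def eccentricity(c, a):
    """e = c / a: c is the center-to-focus distance, a the center-to-vertex distance."""
    return Fraction(c, a)


if __name__ == "__main__":
    print(eccentricity(8, 10))  # answers 1 and 2: foci 8 from center, vertices 10
    print(eccentricity(3, 5))   # answer 6: foci 3 from center, vertices 5
    print(eccentricity(5, 4))   # answer 16: foci 5 from center, vertices 4
    print(eccentricity(3, 2))   # answer 17: foci 3 from center, vertices 2
```

Values below 1 correspond to the ellipses and values above 1 to the hyperbolas, which matches how the answers are grouped.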


13.25 Miscellaneous exercises on conic sections (page 509) 

3. B > 0, A = |(1 + V5 )B 

4 . §()/) 

5. I677 

6. (a) | (b) 2 t 7 ( C ) 48tt/5 

7. x 2 / 1 2 + y-116 = 1 

8. x 2 - 2xy + y 2 - 2x - 2y = 1 

9 . y 2 - 4x 2 - 4y + 4x = 0 

10. (a) e = V2lip + 2); foci at (V2, 0) and (-V2, 0) (b) 6x 2 -> 3_y 2 = 4 

15. (b) y = Cx 2 , C ^ 0 

16. (4, 8) 

17. (a) x = | a (b) llpcf = 4 fl 3 

18. (x-j-)” +(y - f) 2 = f 


Chapter 14 


14.4 Exercises (page 516) 

1. $F'(t) = (1, 2t, 3t^2, 4t^3)$; $F''(t) = (0, 2, 6t, 12t^2)$

2. $F'(t) = (-\sin t, \sin 2t, 2\cos 2t, \sec^2 t)$; $F''(t) = (-\cos t, 2\cos 2t, -4\sin 2t, 2\sec^2 t \tan t)$

3. $F'(t) = ((1 - t^2)^{-1/2}, -(1 - t^2)^{-1/2})$; $F''(t) = (t(1 - t^2)^{-3/2}, -t(1 - t^2)^{-3/2})$

4. $F'(t) = (2e^t, 3e^t)$; $F''(t) = (2e^t, 3e^t)$

5. $F'(t) = (\sinh t, 2\cosh 2t, -3e^{-3t})$; $F''(t) = (\cosh t, 4\sinh 2t, 9e^{-3t})$

6. $F'(t) = (2t/(1 + t^2),\ 1/(1 + t^2),\ -2t/(1 + t^2)^2)$;
$F''(t) = ((2 - 2t^2)/(1 + t^2)^2,\ -2t/(1 + t^2)^2,\ (6t^2 - 2)/(1 + t^2)^3)$

8. (hie- 1) 

9. (l-iV2,iV2,logiV2) 

/ 1 -te 1 +e\ 

!0- |log— 2~,1 -log-^-j 

11. (1, e - 2. 1 - 2/e) 

12 . 0 

15. G'(/) = /•(/) x F”(0 

20. F(t) = + \t 2 B + tC + D 

22. F'(l) = A, F(3) = (6 + 3 log 3)A 

23. F(x) = e x (x + l)A - eA 


14.7 Exercises (page 524) 

1. $\mathbf{v}(t) = (3 - 3t^2)i + 6tj + (3 + 3t^2)k$; $\mathbf{a}(t) = -6ti + 6j + 6tk$; $v(t) = 3\sqrt{2}(1 + t^2)$

2. $\mathbf{v}(t) = -\sin t\, i + \cos t\, j + e^t k$; $\mathbf{a}(t) = -\cos t\, i - \sin t\, j + e^t k$; $v(t) = (1 + e^{2t})^{1/2}$

3. $\mathbf{v}(t) = 3(\cos t - t \sin t)i + 3(\sin t + t \cos t)j + 4k$; $\mathbf{a}(t) = -3(2\sin t + t\cos t)i + 3(2\cos t - t\sin t)j$; $v(t) = (9t^2 + 25)^{1/2}$

4. $\mathbf{v}(t) = (1 - \cos t)i + \sin t\, j + 2\cos\tfrac{t}{2}\, k$; $\mathbf{a}(t) = \sin t\, i + \cos t\, j - \sin\tfrac{t}{2}\, k$; $v(t) = 2$

5. $\mathbf{v}(t) = 6ti + 6t^2 j + 3k$; $\mathbf{a}(t) = 6i + 12tj$; $v(t) = 6t^2 + 3$

6. $\mathbf{v}(t) = i + \cos t\, j + \sin t\, k$; $\mathbf{a}(t) = -\sin t\, j + \cos t\, k$; $v(t) = \sqrt{2}$
9. A = aba> 3 , B = a 2 v/‘ 

11. (b) 8e 4f /cos 2 0 

15. (a) $x(t) = 4\cos 2t$, $y(t) = 3\sin 2t$ (b) $x^2/16 + y^2/9 = 1$

16. 37/4 
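
In answers 1 to 6 of this section the speed is the length of the velocity vector, so each pair can be cross-checked without reference to the original exercise. The sketch below does this numerically for answers 1 and 3 at a few sample times; the function names and the sample times are illustrative assumptions.

```python
import math


def v1(t):
    """Velocity components (i, j, k) as listed in answer 1."""
    return (3 - 3 * t**2, 6 * t, 3 + 3 * t**2)


def v3(t):
    """Velocity components (i, j, k) as listed in answer 3."""
    return (3 * (math.cos(t) - t * math.sin(t)),
            3 * (math.sin(t) + t * math.cos(t)),
            4.0)


def speed(v):
    """Euclidean length of a velocity vector."""
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    for t in (0.0, 0.7, 2.5):
        # Compare with the closed-form speeds quoted in answers 1 and 3.
        print(t, speed(v1(t)), 3 * math.sqrt(2) * (1 + t**2))
        print(t, speed(v3(t)), math.sqrt(9 * t**2 + 25))
```

Each printed pair should agree to floating-point accuracy if the velocity and speed in the answer are consistent with each other.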


14.9 Exercises (page 528) 


1. 

2 . 

3. 

4. 

5 . 

6 . 
9. 
11 . 

12 . 

13. 

14. 

15. 


(a) T = -ijV'2 (-3i + 4j + 5k); N = -|/ - f j (b) a = 1 2\/2 T + 6N 


(a) T = -(1 + e 2 *l l!2 j + e*(\ + e 2 *)~ ll2 k; N 

(b) a = (1 + e 2 ’’)~ ll2 [e 2 ”T + (1 + 2 e 2 ”) ll2 N] 


(1 + e 2n )i -t e 2n j + e n k 
(1 + e 2 ’) 1 ^ + 2e 2 ”) 1/2 


( a ) T = fi + f k; N = j (b) a = 6N 

(a) T = i; N = (j + k); (b) a = V2 N 

( a ) T = 1(2 i + 2j + k); N =-- |(i + 2 j- 2k) (b) a = 12E + (,N 

( a ) T = 5 \/ 2 i" + / J + \k\ N = ~\yj 2 j A- / \ 2 k (b) a = N 


Counter example for (b) and (d): motion on a helix 

One example: r(t) = 2|e 2t cos t dt i + 2je 2t sin t dtj + e u k\ v(t) makes a constant angle 
with k, but «(/) is never zero nor parallel to v(t) 

(a) Counterclockwise (b) 3 (c) 27 t/'V / 3 

x 2 I3 + y 2 / 4 =1 

y 2 = 4x; y 2 = 8 — 4x 

(b) Mil ||5|| sin e 


14.13 Exercises (page 535) 

1. 8a 

2. $\sqrt{2}\,(e^2 - 1)$

3. $2\pi^2 a$

4. $4(a^3 - b^3)/(ab)$

5. $2a\left(\cosh\dfrac{T}{2}\sqrt{\cosh T} - 1\right) - \sqrt{2}\,a\log\left(\dfrac{\sqrt{2}\cosh(T/2) + \sqrt{\cosh T}}{1 + \sqrt{2}}\right)$

6. $2\sqrt{2}\,\pi$

7. 50

8. $\sqrt{2}\log(1 + \sqrt{2})$

9. $|\omega|\sqrt{a^2 + b^2}\,(t_1 - t_0)$

10. 1 + [^'(f)] 2 tty 

11. $\dfrac{26\sqrt{13} - 16}{27}$

13. (a) | Vl+ e 2 * dx (b) j J?-+j 2 dt 

2 

14. (c) c f 
16. $f(x) = k\cosh\left(\dfrac{x}{k} + C\right)$, or $f(x) = k$

19. $v(t) = 1 + 2t$; 3 units of time


14.15 Exercises (page 538) 

1. (1) $\tfrac{1}{75}$ (2) $(1 + 2e^{2\pi})^{1/2}(1 + e^{2\pi})^{-3/2}$ (3) $\tfrac{6}{25}$ (4) $\tfrac14\sqrt{2}$ (5) $\tfrac{2}{27}$ (6) $\tfrac12$

1 

||U|| sin 6 

4. (a) x = z 

7. k = ||a||/||r|| 2 

9. a = j, b = 2; intersect at (0, 0) 

10. Vertex at -1 COS 0 A + \ COS 2 d B 

11. (a) a (?) = |tt — 5/ 2 (b) jt(/) = 5 sin 5l 2 i + 5 cos 5( 2 y 

12. V2 i + Vlj 


14.19 Exercises (page 543) 

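1. $\mathbf{v}(t) = \mathbf{u}_r + t\,\mathbf{u}_\theta$; $\mathbf{a}(t) = -t\,\mathbf{u}_r + 2\mathbf{u}_\theta$; $\kappa(t) = (2 + t^2)(1 + t^2)^{-3/2}$

2. (a) $\mathbf{v}(t) = \mathbf{u}_r + t\,\mathbf{u}_\theta + \mathbf{k}$; $\mathbf{a}(t) = -t\,\mathbf{u}_r + 2\mathbf{u}_\theta$; $\kappa(t) = (t^4 + 5t^2 + 8)^{1/2}(2 + t^2)^{-3/2}$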
(b) arccosV2/(2 + t 2 ) 

3. (b) ^7T — t 

5. 32 

-\/l + c 2 e 4 ™ - 1 

6. (b) 1(c) = (e 2 « - 1) tfc ^0; 1(0) = 2 tt. a(c) = —7 if c ^ 0; 

o(0) = 77 

7. (a) 3 tt/ 16 (b) 2 + ^a/ 3 log (2 + \/3) 

8. $\tfrac12\pi(\pi^2 + 1)^{1/2} + \tfrac12\log\left(\pi + \sqrt{\pi^2 + 1}\right)$

9. $\sqrt{2}\,(e^{\pi} - 1)$

10. 4

11. 8


toll-* 
13. (a) $(\theta^2 + 1)^{3/2}/(\theta^2 + 2)$ (b) $\sqrt{2}\,e^{\theta}$ (c) IV2 * V2 (d): %y/l

15. $r = r_0 e^{-\theta\cot\alpha}$; target at origin, missile starts at $r = r_0$, $\theta = 0$; $\alpha$ denotes the angle, $0 < \alpha < \pi$, determined by $\mathbf{v}$ and $-\mathbf{r}$; for $0 < \alpha < \pi/2$ the path is a spiral for which $r \to 0$ as $\theta$ increases indefinitely; for $\alpha = \pi/2$ it is a circle about the origin; for $\pi/2 < \alpha < \pi$ it is a spiral for which $r$ increases indefinitely as $\theta$ increases indefinitely.

16. Use as positive x-axis the line from position sighted four miles away to ground crew. Proceed three miles along this line (to allow for the possibility that the missile is returning to base) and then follow the spiral $r = e^{\theta/\sqrt{8}}$

17. $\log\sqrt{x^2 + y^2} + \arctan(y/x) = C$
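
The three cases described in answer 15 can be illustrated with a short Python sketch. It assumes the pursuit path reads r = r0 * exp(-theta * cot(alpha)), with alpha the angle between the velocity and the direction toward the target. The function names and the comparison after one full turn are illustrative choices, not taken from the text.

```python
import math

def spiral_radius(theta, r0, alpha):
    # r = r0 * exp(-theta * cot(alpha)), with cot(alpha) = cos(alpha) / sin(alpha)
    return r0 * math.exp(-theta * math.cos(alpha) / math.sin(alpha))

def trend(alpha, r0=1.0):
    # Compare the radius after one full turn with the starting radius
    r = spiral_radius(2 * math.pi, r0, alpha)
    if math.isclose(r, r0, rel_tol=1e-9):
        return "circle"
    return "inward spiral" if r < r0 else "outward spiral"
```

For 0 < alpha < pi/2 the cotangent is positive and the radius shrinks, for alpha = pi/2 it vanishes and the radius stays constant, and for pi/2 < alpha < pi it is negative and the radius grows. This matches the verbal description in the answer.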

14.21 Miscellaneous review exercises (page 549) 

1. $\tan\alpha = \tan\theta/(2 + \tan^2\theta)$

3. $(c/m^2,\ 2c/m)$

4. (a) $y - y_0 = m(x - x_0) + c/m$; tangent at $(x_0 + c/m^2,\ y_0 + 2c/m)$

(b) $y - y_0 = m(x - x_0) - cm^2$; tangent at $(x_0 + 2cm,\ y_0 + cm^2)$

6. $(y_1 - y_0)(y - y_0) = 2c(x + x_1 - 2x_0)$; $x_1 y = 2y_1 x - x_1 y_1$;

$(x_1 - x_0)(y - y_0) = 2(y_1 - y_0)(x - x_0) - (x_1 - x_0)(y_1 - y_0)$

7. (a) $(0, \tfrac12)$

(b) Write $Q = (0, b(x))$. If $f''(0) \ne 0$ then $b(x) \to f(0) + \dfrac{1}{f''(0)}$ as $x \to 0$.

Otherwise, $|b(x)| \to +\infty$ as $x \to 0$.

1 + c 1 

s . r - — >-y as C ^ 0 

13. (2, 1), (-2, -1) 

14. l\/2 

15 . 3 ,'-/- 3 

21. (a) $f(\theta) = k\sin(\theta + C)$, or $f(\theta) = k$

(b) $f(\theta) = Ce^{\theta/\sqrt{k^2 - 1}}$, where $k^2 > 1$

(c) $f(\theta) = (2/k)\sec(\theta + C)$, or $f(\theta) = 2/k$
15.5 Exercises (page 555)

1. Yes    8. Yes    15. Yes    22. Yes
2. Yes    9. Yes    16. Yes    23. No
3. Yes    10. Yes    17. Yes    24. Yes
4. Yes    11. No    18. Yes    25. No
5. No    12. Yes    19. Yes    26. Yes
6. Yes    13. Yes    20. Yes    27. Yes
7. Yes    14. No    21. Yes    28. Yes

31. (a) No (b) No (c) No (d) No



15.9 Exercises (page 560)

1. Yes; 2    5. Yes; 1    9. Yes; 1    13. Yes; n
2. Yes; 2    6. No    10. Yes; 1    14. Yes; n
3. Yes; 2    7. No    11. Yes; n    15. Yes; n
4. Yes; 2    8. No    12. Yes; n    16. Yes; n
17. Yes; dim = $1 + \tfrac12 n$ if n is even, $\tfrac12(n + 1)$ if n is odd

18. Yes; dim = $\tfrac12 n$ if n is even, $\tfrac12(n + 1)$ if n is odd

19. Yes; k + 1

20. No

21. (a) dim = 3 (b) dim = 3 (c) dim = 2 (d) dim = 2

23. (a) If $a \ne 0$ and $b \ne 0$, set is independent, dim = 3; if one of a or b is zero, set is dependent, dim = 2 (b) Independent, dim = 2 (c) If $a \ne 0$, independent, dim = 3; if a = 0, dependent, dim = 2 (d) Independent; dim = 3 (e) Dependent; dim = 2 (f) Independent; dim = 2 (g) Independent; dim = 2 (h) Dependent; dim = 2 (i) Independent; dim = 2 (j) Independent; dim = 2


15.12 Exercises (page 566) 

1. (a) No (b) No (c) No (d) No (e) Yes

8. (a) $\tfrac12\sqrt{e^2 + 1}$ (b) $g(x) = b\left(x - \dfrac{e^2 + 1}{4}\right)$, b arbitrary

10. (b) $\dfrac{(n + 1)(2n + 1)}{6n}\,a + \dfrac{n + 1}{2}\,b$ (c) $g(t) = a\left(t - \dfrac{2n + 1}{3n}\right)$, a arbitrary

11. (c) 43 (d) $g(t) = a(1 - \tfrac23 t)$, a arbitrary

12. (a) No (b) No (c) No (d) No

13. (c) 1 (d) $e^2 - 1$

14. (c) $n!/2^{n+1}$


15.16 Exercises (page 576) 

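1. (a) and (b) $\tfrac13\sqrt{3}\,(1, 1, 1)$, $\tfrac16\sqrt{6}\,(1, -2, 1)$

2. (a) $\tfrac12\sqrt{2}\,(1, 1, 0, 0)$, $\tfrac16\sqrt{6}\,(-1, 1, 2, 0)$, $\tfrac16\sqrt{3}\,(1, -1, 1, 3)$

(b) $\tfrac13\sqrt{3}\,(1, 1, 0, 1)$, $\dfrac{1}{\sqrt{42}}(1, -2, 6, 1)$

6. $\tfrac23 - \tfrac12\log^2 3$

7. $e^2 - 1$

8. $\tfrac12(e - e^{-1}) + \dfrac{3}{e}x$; $1 - 7e^{-2}$

9. $\pi - 2\sin x$

10.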


16.4 Exercises (page 582) 

1. Linear; nullity 0, rank 2

2. Linear; nullity 0, rank 2

3. Linear; nullity 1, rank 1

4. Linear; nullity 1, rank 1

5. Nonlinear

6. Nonlinear

7. Nonlinear

8. Nonlinear

9. Linear; nullity 0, rank 2

10. Linear; nullity 0, rank 2

11. Linear; nullity 0, rank 2

12. Linear; nullity 0, rank 2

13. Nonlinear

14. Linear; nullity 0, rank 2

15. Nonlinear

16. Linear; nullity 0, rank 3

17. Linear; nullity 1, rank 2

18. Linear; nullity 0, rank 3

19. Nonlinear

20. Nonlinear

21. Nonlinear

22. Nonlinear

23. Linear; nullity 1, rank 2

24. Linear; nullity 0, rank n

25. Linear; nullity 1, rank infinite

26. Linear; nullity infinite, rank 2

27. Linear; nullity 2, rank infinite

28. N(T) is the set of constant sequences; T(V) is the set of sequences with limit 0

29. (d) $\{1, \cos x, \sin x\}$ is a basis for T(V); dim T(V) = 3 (e) N(T) = S (f) If $T(f) = cf$ with $c \ne 0$, then $cf \in T(V)$ so we have $f(x) = c_1 + c_2\cos x + c_3\sin x$; if $c_1 = 0$, then $c = \pi$ and $f(x) = c_1\cos x + c_2\sin x$, where $c_1$, $c_2$ are not both zero but otherwise arbitrary; if $c_1 \ne 0$, then $c = 2\pi$ and $f(x) = c_1$, where $c_1$ is nonzero but otherwise arbitrary.


16.8 Exercises (page 589) 


3. Yes; $x = v$, $y = u$

4. Yes; $x = u$, $y = -v$

5. No

6. No

7. No

8. Yes; $x = \log u$, $y = \log v$

9. No

10. Yes; $x = u - 1$, $y = v - 1$

11. Yes; $x = \tfrac12(v + u)$, $y = \tfrac12(v - u)$

12. Yes; $x = \tfrac13(v + u)$, $y = \tfrac13(2v - u)$

13. Yes; $x = w$, $y = v$, $z = u$

14. No

15. Yes; $x = u$, $y = \tfrac12 v$, $z = \tfrac13 w$

16. Yes; $x = u$, $y = v$, $z = w - u - v$

17. Yes; $x = u - 1$, $y = v - 1$, $z = w + 1$

18. Yes; $x = u - 1$, $y = v - 2$, $z = w - 3$

19. Yes; $x = u$, $y = v - u$, $z = w - v$

20. Yes; $x = \tfrac12(u - v + w)$, $y = \tfrac12(v - w + u)$, $z = \tfrac12(w - u + v)$

25. $(S + T)^2 = S^2 + ST + TS + T^2$;

$(S + T)^3 = S^3 + TS^2 + STS + S^2T + ST^2 + TST + T^2S + T^3$

26. (a) $(ST)(x, y, z) = (x + y + z,\ x + y,\ x)$; $(TS)(x, y, z) = (z,\ z + y,\ z + y + x)$;

$(ST - TS)(x, y, z) = (x + y,\ x - z,\ -y - z)$; $S^2(x, y, z) = (x, y, z)$;

$T^2(x, y, z) = (x,\ 2x + y,\ 3x + 2y + z)$;

$(ST)^2(x, y, z) = (3x + 2y + z,\ 2x + 2y + z,\ x + y + z)$;

$(TS)^2(x, y, z) = (x + y + z,\ x + 2y + 2z,\ x + 2y + 3z)$;

$(ST - TS)^2(x, y, z) = (2x + y - z,\ x + 2y + z,\ -x + y + 2z)$;

(b) $S^{-1}(u, v, w) = (w, v, u)$; $T^{-1}(u, v, w) = (u,\ v - u,\ w - v)$;

$(ST)^{-1}(u, v, w) = (w,\ v - w,\ u - v)$; $(TS)^{-1}(u, v, w) = (w - v,\ v - u,\ u)$

(c) $(T - I)(x, y, z) = (0,\ x,\ x + y)$; $(T - I)^2(x, y, z) = (0, 0, x)$;

$(T - I)^n(x, y, z) = (0, 0, 0)$ if $n \ge 3$

28. (a) $Dp(x) = 3 - 2x + 12x^2$; $Tp(x) = 3x - 2x^2 + 12x^3$; $(DT)p(x) = 3 - 4x + 36x^2$;

$(TD)p(x) = -2x + 24x^2$; $(DT - TD)p(x) = 3 - 2x + 12x^2$;

$(T^2D^2 - D^2T^2)p(x) = 8 - 192x$ (b) $p(x) = ax$, a an arbitrary scalar

(c) $p(x) = ax^2 + b$, a and b arbitrary scalars (d) All p in V

31. (a) $Rp(x) = 2$; $Sp(x) = 3 - x + x^2$; $Tp(x) = 2x + 3x^2 - x^3 + x^4$;

$(ST)p(x) = 2 + 3x - x^2 + x^3$; $(TS)p(x) = 3x - x^2 + x^3$; $(TS)^2p(x) = 3x - x^2 + x^3$;
$(T^2S^2)p(x) = -x^2 + x^3$; $(S^2T^2)p(x) = 2 + 3x - x^2 + x^3$; $(TRS)p(x) = 3x$;

$(RST)p(x) = 2$

(b) $N(R) = \{p \mid p(0) = 0\}$; $R(V) = \{p \mid p \text{ is constant}\}$; $N(S) = \{p \mid p \text{ is constant}\}$;
$S(V) = V$; $N(T) = \{O\}$; $T(V) = \{p \mid p(0) = 0\}$ (c) $T^{-1} = S$

(d) $(TS)^n = I - R$; $S^nT^n = I$

32. T is not one-to-one on V because it maps all constant sequences onto the same sequence 
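
The composition formulas in answer 26 can be spot-checked with a small Python sketch. It assumes S(x, y, z) = (z, y, x) and T(x, y, z) = (x, x + y, x + y + z). These definitions are inferred here from the inverses listed in part (b) and are not quoted from the exercise statement. The helper `compose` is a hypothetical name introduced for illustration.

```python
def S(p):
    # Assumed definition: reverse the order of the coordinates
    x, y, z = p
    return (z, y, x)

def T(p):
    # Assumed definition: running sums of the coordinates
    x, y, z = p
    return (x, x + y, x + y + z)

def T_inv(p):
    # Inverse of T as listed in answer 26(b)
    u, v, w = p
    return (u, v - u, w - v)

def compose(*maps):
    # compose(f, g)(p) == f(g(p)); the rightmost map is applied first
    def composed(p):
        for m in reversed(maps):
            p = m(p)
        return p
    return composed

ST = compose(S, T)
TS = compose(T, S)
```

Evaluating `ST` and `TS` at a sample point gives different results, which is why the answer lists ST, TS, and ST - TS separately: composition of linear transformations is not commutative in general.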
16.12 Exercises (page 596)

1. (a) The identity matrix $I = (\delta_{jk})$, where $\delta_{jk} = 1$ if $j = k$, and $\delta_{jk} = 0$ if $j \ne k$

(b) The zero matrix $O = (a_{jk})$, where each entry $a_{jk} = 0$

(c) The matrix $(c\delta_{jk})$, where $(\delta_{jk})$ is the identity matrix of part (a)


2. (a) 






"0 

1 

0 

0 

o' 

'1 0 O' 

(b) 

"0 1 O' 





-0 1 0_ 

-0 0 lj 

(C) 

0 

0 

1 

0 

0 





0 

0 

0 

1 

0 


3. (a) -5i + lj, 9i 
'1 2 " 


(b) 


1 -1. 


12j 

3 0" 

LO 3. 


(c) 


1 7 . _r 

4 4 


L 4 


3 0 

Lo 3j 


4. 


1 

0" 


4 

O' 

L o 

2_ 

’ 

_0 

4_ 


5. (a) 3 i + 4j + 4k; nullity 0, rank 3 (b) 


1 


1 


1 -3 

1 -5 


6 . 


2 0 

1 -1 

2 1 


7. (a) $T(4\mathbf{i} - \mathbf{j} + \mathbf{k}) = (0, -2)$; nullity 1, rank 2

"0 1 3~ 


(b) 


1 1 
1 -1 


(c) 


0 0-2 
8. (a) (5, 0, -1); nullity 0, rank 2 


(d) $e_1 = \mathbf{j}$, $e_2 = \mathbf{k}$, $e_3 = \mathbf{i}$, $w_1 = (1, 1)$, $w_2 = (1, -1)$


(b) 


1 -1 
0 0 

1 1 


(c) $e_1 = \mathbf{i}$, $e_2 = \mathbf{i} + \mathbf{j}$, $w_1 = (1, 0, 1)$, $w_2 = (0, 0, 2)$, $w_3 = (0, 1, 0)$


9. (a) (-1, -3, -1); nullity 0, rank 2 (b)

(c) $e_1 = \mathbf{i}$, $e_2 = \mathbf{j} - \mathbf{i}$, $w_1 = (1, 0, 1)$, $w_2 = (0, 1, 0)$, $w_3 = (0, 0, 1)$

.1 2 

10. (a) $e_1 - e_2$; nullity 0, rank 2 (b)


5 4 


LO ll 
I, ( 

(c) a = 5, b = 4


11 . 


0 -1 
1 0 


-1 0 
0 -1 



"0 1 O' 


'0 

0 o' 


12. 

0 0 0 

» 

0 

0 0 



0 0 1 


0 

o 1_ 



r° 1 

f 


'0 

0 o' 

13. 

0 0 

-1 

, 

0 

0 -1 


|0 0 

1 


0 

0 1 


1 1 

0 1 

0 -1 


-1 0 
0 -1 

O'" 

1 

9 

- 1 • 

0 

0 - 2 ] 


0 - 2 
2 0 


2 -3 

3 2 


2J 12 -5 

0 0 0 o" 

0 10 0 

(t 

0 0 2 0 
0 0 0 3 
0 -1 0 0 " 

0 0-20 
0 0 0 -3 



"o 

1 

0 

o" 


0 

0 

0 

0 


0 

0 

4 

0 


0 

0 

2 

0 

(b) 

0 

0 

0 

9 

(c) 

0 

0 

0 

6 


0 

0 

0 

0 


0 

0 

0 

0 


0 0 0 0 
0 10 0 
0 0 4 0 
0 0 0 9 


0 - 8 

0 

0 0 

A 0 


Choose $(x^3, x^2, x, 1)$ as a basis for V, and $(x^2, x)$ as a basis for W. Then the matrix of TD is


16.16 Exercises (page 603) 

[3 4*1 r 

AD 15 ~ 14 

1. B + C = 0 2 , AB = , BA 

-15 14 

6-5 L J 





"0 

0 

"o 

0" 

CA = 

2 

-8 

0 

0 



L 

- 1 


_4 

-16 


L J [4 -16 8J 

fa 1 f - 2 a 

2. (a) 1 a and b arbitrar) (b) 


-1 

4 

-2“ 

-4 

16 

-8 

7 

-28 

14 


30 

-28 


-30 

28 


, a and b arbitrary 


3. (a) a = 9, b = 6, c = 1, d = 5 (b) a = 1, b = 6, c = 0, d =


4. (a) 


-9 -2 -10 

6 14 8 (b) 

-7 5 -5 


-3 5 -4 

0 3 24 

12 -27 0 
$\begin{bmatrix} \cos n\theta & -\sin n\theta \\ \sin n\theta & \cos n\theta \end{bmatrix}$


8. A" 


1 n 


01 


00 


T 


-100 1 


where b and c are arbitrary, and a is any solution of the equation $a^2 = -bc$



, where a is arbitrary 


0 -1 


where b and c are arbitrary and a is any solution of the equation $a^2 = 1 - bc$


IS 13 
2 2 


13 19 

4 '4 


43 2 5 

4 4 


14. (b) $(A + B)^2 = A^2 + AB + BA + B^2$; $(A + B)(A - B) = A^2 + BA - AB - B^2$
(c) For those which commute
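
The expansions in answer 14 can be checked on concrete 2 x 2 matrices with the following Python sketch. The matrices A, B, and C are arbitrary illustrative choices, not taken from the exercise, and the helper names are introduced here only for the example.

```python
def matmul(a, b):
    # Product of two 2 x 2 matrices given as nested lists
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def matsub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]   # does not commute with A
C = [[2, 0], [0, 2]]   # a scalar matrix, commutes with A

def square_of_sum(a, b):
    s = matadd(a, b)
    return matmul(s, s)

def expanded_square(a, b):
    # A^2 + AB + BA + B^2, keeping AB and BA as separate terms
    return matadd(matadd(matmul(a, a), matmul(a, b)),
                  matadd(matmul(b, a), matmul(b, b)))

def sum_times_difference(a, b):
    return matmul(matadd(a, b), matsub(a, b))

def difference_of_squares(a, b):
    return matsub(matmul(a, a), matmul(b, b))
```

With the non-commuting pair A and B, `sum_times_difference` differs from `difference_of_squares`, while with the commuting pair A and C the two agree. This is the content of part (c).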

16.20 Exercises (page 613) 

1. (X, y, Z) = (f, -h !) 

2. No solution 

3. $(x, y, z) = (1, -1, 0) + t(-3, 4, 1)$

4. $(x, y, z) = (1, -1, 0) + t(-3, 4, 1)$

5. $(x, y, z, u) = (1, 1, 0, 0) + t(1, 14, 5, 0)$

6. $(x, y, z, u) = (1, 8, 0, -4) + t(2, 7, 3, 0)$

7. $(x, y, z, u, v) = t_1(-1, 1, 0, 0, 0) + t_2(-1, 0, 3, -3, 1)$

8. $(x, y, z, u) = (1, 1, 1, -1) + t_1(-1, 3, 7, 0) + t_2(4, 9, 0, 7)$

9. (x,y, z) = (3, |,0) + t( 5, 1, -3) 

10. (a) $(x, y, z, u) = (1, 6, 3, 0) + t_1(4, 11, 7, 0) + t_2(0, 0, 0, 1)$

(b) (x, y, z, u ) = (iV* 4. {I, 0) + r(4, -11,7, 22) 

f-i 2 n r i4 s 3i 


3 2 1 


0 \ o-i o r 

1 0 0 0 0 0 

0 0 0 1 0 -1 


0 1 0 


16.21 Miscellaneous exercises on matrices (page 614) 


_L5 -1J 

"o oi r i o“ 

_o oj .0 1. 


where b and c are arbitrary and a is any solution of the quadratic equation $a^2 - a + bc = 0$


io. (a) 


-l -r 

-i i_r 
ABEL, NIELS HENRIK, 407

Abel's partial summation formula, 407 
Abel's test for convergence, 408 
Abscissa, 48 

Absolute convergence of series, 406 
Absolute maximum and minimum, 150 
Absolute values, 41, 363 
Acceleration, 160, 521 
in polar coordinates, 541 
normal and tangential components of, 527 
Addition formulas for the sine and cosine, 96 
Additive property: 
of arc length, 532 
of area, 59 

of averages, 119 (Exercise 13) 
of convergent series, 385 
of derivatives, 164 
of finite sums, 40 
of the integral, 66, 67, 80, 514 
of the supremum and infimum, 27
of volume, 112
of work, 115 
Alternating series, 403 
Analytic geometry, 48, 471 
Analytic model of Euclidean geometry, 471 
Angles : 

in a Euclidean space, 564 
in n-space, 458 
radian measurement of, 102 
Angular acceleration, 545 (Exercise 19) 
Angular speed, 522, 545 (Exercise 19) 

Angular velocity, 545 (Exercise 19) 
Antiderivative (primitive), 205 
APOLLONIUS, 498
Approximations: 

by polynomials, 272-304, 575 
by trigonometric polynomials, 575 
in a Euclidean space, 574 
Arbitrary constant, 211, 307 
ARBOGAST, LOUIS, 171 
Arc cosine, 254 

Archimedean property of real numbers, 26 


ARCHIMEDES, 2-9, 26 

Arc length: 

as an integral, 534 
definition of, 530, 531 
function, 533 

in polar coordinates, 544 (Exercise 4) 
Arc sine, 253 
Arc tangent, 255 

Area: 

and similarity transformations, 92 
axiomatic definition of, 57-59 
in polar coordinates, 110 
of an ordinate set, 75 
of a radial set, 110
of a region between two graphs, 88 
Argument of a complex number, 363 
Arithmetic mean, 46, 117 
Associative law : 

for addition of numbers, 18, 359 
for addition of vectors, 447 
for composition of functions, 141, 584 
for multiplication of numbers, 18, 359 
for union and intersection of sets, 14 
in a linear space, 551, 552 
Asymptote, 190 
of a hyperbola, 506 
Asymptotically equal, 396 
Average, 46, 149 

of a function, 117-119
rate of change, 160 
velocity, 157 
weighted, 118 
Axes, 48, 197 
Axiom(s): 

completeness (continuity), 25 
field, 17 

for a linear space, 551, 552 
for area, 58, 59 

for the real-number system, 17-25 
for volume, 112 
least-upper-bound, 25
order, 20 
Index


Axiomatic development : 
of area, 57-59 
of inner products, 561 
of the real-number system, 17-25 
of vector algebra, 551, 552 
of volume, 111-112

BARROW, ISAAC, 157 

Base of logarithms, 232 
Basis, 327, 466 
Bernoulli : 

differential equation, 312 
inequality, 46 (Exercise 14) 
polynomials, 225 (Exercise 35) 

BERNOULLI, JOHANN, 235, 292, 305, 331 
BERNSTEIN, SERGEI, 437 

Bernstein's theorem, 437 
Bessel functions, 443 (Exercise 10) 

Binary scale, 393 

Binomial coefficient, 44, 383, 442 

Binomial series, 377, 441 

Binomial theorem, 44 (Exercise 4), 378, 442 

BOLYAI, JOHANN, 474 

BOLZANO, BERNARD, 143 

Bolzano’s theorem, 143 

BOOLE, GEORGE, 11

Bound: 

greatest lower, 25 

least upper, 24 
upper and lower, 23-25 
Bounded function, 73 
Bounded sequence, 381
Bounded set of real numbers, 23 
Boundedness of continuous functions, 150 
BROUNCKER, WILLIAM, 377, 390 

Calculation: 
of e, 281 

of logarithms, 240-242 
of π, 285 (Exercise 10)
of square roots, 444 (Exercises 20, 21) 
Calculus, fundamental theorems of, 202, 205, 
515 

CANTOR, GEORG, 11, 17 
CARDANO, HIERONIMO, 3 

Cartesian equation, 49, 475, 494 
Cartesian geometry, 48 

CAUCHY, AUGUSTIN-LOUIS, 3, 42, 127, 172, 186, 
284, 368, 378, 397, 399,411, 452 
Cauchy-Schwarz inequality, 42, 452, 563 
Cauchy’s mean-value formula, 186 
Cauchy’s remainder in Taylor’s formula, 284 
CAVALIERI, BONAVENTURA, 3, 111 
Cavalieri solid, 111
Cavalieri’s principle, 111, 112 


CAYLEY, ARTHUR, 446 

Center of mass, 118 
Centrifugal, 522 
Centripetal, 522
Chain rule, 174, 514 
Characteristic equation, 327 
Circle, 49, 521 

of convergence, 428 
Circular helix, 523 
Circular motion, 521 
Class of sets, 14 
Closed interval, 60 
Closure axioms, 551 
Coefficient matrix, 605 
Column matrix (column vector), 592, 598 
Commutative law: 

for addition of numbers, 18, 359 
for addition of vectors, 447 
for dot products, 451 
for inner products, 561 
for multiplication of numbers, 18, 359 
for union and intersection of sets, 14 
in a linear space, 551 
Comparison tests for convergence: 
of improper integrals, 418 
of series, 394-396 

Comparison theorem for integrals, 67, 81 
Complement of a set, 14 
Complex Euclidean space, 562 
Complex function, 368 
Complex linear space, 552 
Complex numbers, 358-373 
Complex vector space, 468 
Composite function, 140, 584
continuity of, 141 
differentiation of, 174, 514 
Composition of transformations, 584 
Concave function, 122, 189 
Conditional convergence, 406 
Congruence of sets, 58 
Conic sections, 497-507 
Conjugate complex number, 364 
Constant function, 54
Continuous functions: 

definition of, 130, 369, 513 
integrability of, 153 
theorems on, 132, 141-154 
Contour lines, 197 
Convergence : 

of improper integrals, 416, 418 
of sequences, 379 
of series, 384-425 
pointwise, 422 
tests for, 394-408 
uniform, 424 

Convex function, 122, 189 
Convex set, 112
Coordinates : 
cylindrical, 543 
polar, 108, 540 
rectangular, 48, 197 
Copernican theory, 545 
Cosine function : 

continuity of, 134, 139 
differential equation for, 323 
differentiation of, 162 
integration of, 100, 207 
power series for, 436 
properties of, 96 
Cotangent function, 103 
Cramer’s rule, 491 
Critical point, 188 
Cross product (vector product), 483 
Curvature, 537 
Curve : 

definition of, 517 
length of, 529-535 

nonrectifiable, 530, 536 (Exercise 22) 
rectifiable, 530 
Cycloid, 536 (Exercise 20) 

Cylindrical coordinates, 543 


Damped vibrations, 335 

DANDELIN, GERMINAL P., 498

Decimal expansion of real numbers, 30, 393 
Decreasing function, 76 
Decreasing sequence, 381 
DEDEKIND, RICHARD, 17 
Deductive systems, 8 
Definite integral: 
definition of, 73 
properties of, 80, 81 
De Moivre’s theorem, 371 (Exercise 5) 
Dependence, linear, 463, 557 
Derivatives : 

and continuity, 163 
functions of one variable, 160 
functions of several variables, 199-201 
notations for, 160, 171, 172, 199, 200 
of complex-valued functions, 369 
of higher order, 160, 200 
of vector-valued functions, 513 
partial, 199-201 
theorems on, 164 
DESCARTES, RENE, 48, 446 
Determinant, 486 
Difference: 
of functions, 132 
of real numbers, 18 
of sets, 14 
of vectors, 447 


Difference quotient, 157, 159, 517 
Differential equations, 305-357 
first-order linear, 308 
homogeneous first-order, 347-350 
power-series solutions of, 439-443 
second-order linear, 322 
separable, 345 

Dimension of a linear space, 559 
Direction field, 343 
Directrix of conic sections, 500 
DIRICHLET, PETER GUSTAV LEJEUNE, 407 
Dirichlet’s test for convergence, 407 
Discontinuity: 
infinite, 131
jump, 131
removable, 131
Disjoint sets, 14 
Distance: 

between two planes, 495 
between two points, 364, 462 
from a point to a line, 476, 477 
from a point to a plane, 494 
Distributive law : 
for cross products, 483 
for inner products, 451, 561 
for numbers, 18, 359 
for set operations, 16 (Exercise 10) 
in a linear space, 552 
Divergent improper integral, 416, 418
Divergent sequence, 379 
Divergent series, 384 
Division : 

of functions, 55 
of numbers, 18, 360 

Domain of a function, 50, 53, 196, 512, 578 
Dot product (inner product), 451, 469, 562 
Duodecimal scale, 393 


e (base of natural logarithm): 
computation of, 281 
definition of, 231
irrationality of, 282 
Earth, 545 

Eccentricity of conic sections, 500 
Electric circuits, 317, 336 
Element : 

of a determinant, 486 
of a matrix, 592, 598 
of a set, 11

Elementary function, 282 
Ellipse, 498, 500, 506 
Elliptic integral, 535 (Exercise 17) 
Elliptic reflector, 519 
Empty set, 13 

Endpoints of an interval, 60 
Envelope, 342
Equality : 

of complex numbers, 358 
of functions, 54 
of sets, 12 
of vectors, 447, 468 
Equipotential lines, 351 
Error in Taylor’s formula, 278, 280 
EUCLID, 9, 471
Euclidean geometry, 9, 471 
Euclidean space, 472, 561 
EULER, LEONARD, 231, 377, 396, 405, 420
Euler’s constant, 405
Even function, 84 (Exercise 25) 

Even integer, 28 (Exercise 10) 

Exhaustion, method of, 2-8 
property of area, 59 
Existence theorems, 308, 323 
Exponential function: 
complex-valued, 367 
definition of, 242 
derivative of, 243 
integral of, 246 
power series for, 436 
Extremum: 

definition of, 182 
tests for, 182, 188, 189 

Extreme-value theorem for continuous functions, 151


Factorials, 44, 52 

Family of curves, 341, 351 

FERMAT, PIERRE DE, 3, 156 

FERRARI, LODOVICO, 3 

Fibonacci (Leonardo of Pisa), 379 

Fibonacci numbers, 46 (Exercise 16), 379 

Field axioms, 17 

Fixed point, 145 (Exercise 5) 

Focus of a conic section, 498
FOURIER, JOSEPH, 127, 575 
Fourier coefficients, 575 
Frequency of simple harmonic motion, 339 
Function(s): 
bounded, 73 

characteristic, 64 (Exercise 8) 

complex-valued, 368 

concave, 122 

constant, 54 

continuous, 130 

convex, 122 

decreasing, 76 

defined by an integral, 120

domain of, 53 

elementary, 282 

even, 84 (Exercise 25) 


exponential, 242, 367 
factorial, 52 
formal definition of, 53
gamma, 419, 421 (Exercise 19) 
greatest-integer, 63 
hyperbolic, 251
identity, 51
increasing, 76 

informal description of, 50-52 
integrable, 73 
inverse, 146, 252 
inverse trigonometric, 253-256 
linear, 54 

logarithmic, 229-235 
monotonic, 76 
notation for, 50, 196, 512 
odd, 84 (Exercise 25) 
of several variables, 196 
periodic, 95 
piecewise linear, 123 
piecewise monotonic, 77 
polynomial, 55 
power, 54
range of, 53
rational, 166, 258-266 
real-valued, 51
Riemann zeta, 396 
step, 52 

trigonometric, 95-107 
unbounded, 73 
vector-valued, 512
Function space, 553 
Functional equation, 227 

for the exponential function, 243 
for the logarithm, 227 
Fundamental theorem of algebra, 362 
Fundamental theorems of calculus, 202, 205, 515


GALILEO, 498 

Gamma function, 419, 421 (Exercise 19) 

GAUSS, KARL FRIEDRICH, 358, 362, 378, 473 
Gauss-Jordan elimination process, 607 
Gauss' test for convergence, 402 (Exercise 17) 
Geometric interpretation: 
of derivative as a slope, 169 
of integral as area, 65, 75, 89 
Geometric mean, 47 (Exercise 20) 

Geometric series, 388-390 

GIBBS, JOSIAH WILLARD, 445 

GRAM, JØRGEN PEDERSON, 568

Gram-Schmidt process, 568 

Graph of a function, 51 

GRASSMANN, HERMANN, 446 

Gravitational attraction, Newton’s law of, 546 

Greatest-integer function, 63 
GREGORY, JAMES, 390, 403
Gregory’s series, 403 
Growth laws, 320, 321 


HADAMARD, JACQUES, 615 

Hadamard matrices, 615 (Exercise 10) 
Half-life, 313 

HAMILTON, WILLIAM ROWAN, 358, 445 

Harmonic mean, 46 
Harmonic motion, 334 
Harmonic series, 384 

HEAVISIDE, OLIVER, 445 

Helix, 523 

Heron’s formula, 493 
Higher-order derivatives, 160, 200 

HILBERT, DAVID, 471 
HOLMES, SHERLOCK, 7 

Homogeneous differential equation, 347-350 
Homogeneous property : 
of finite sums, 40 
of infinite series, 385 
of integrals, 66 

Homogeneous system of equations, 605 

HOOKE, ROBERT, 50 

Hooke’s law, 50, 116

Hyperbola, 498, 500, 506 

Hyperbolic function, 251

Hyperbolic paraboloid, 198 

Identity element : 
for addition, 18 
for multiplication, 18 
Identity, function, 51 
matrix, 600 
transformation, 579 
Implicit differentiation, 179 
Implicit function, 179 
Improper integral : 
of the first kind, 416 
of the second kind, 418 
Improper rational function, 259 
Increasing function, 76 
Increasing sequence, 381
Indefinite integral, 120, 134 
Indeterminate forms, 289-302
Induction : 

definition by, 39 
proof by, 32-37 
Inductive set, 22 
Inequality, 20 
Bernoulli, 46

Cauchy-Schwarz, 42, 452, 563 
for the sine and cosine, 95 
triangle, 42, 364, 454, 563 


Infimum, 25 
Infinite limits, 299, 300 
Infinity, 297 
Inflection point, 191 
Initial condition, 307 
Initial-value problem, 307 
Inner product, 451, 469, 561 
Integer, 22 
Integrability: 

of a continuous function, 153 
of a monotonic function, 77 
Integral: 
curve, 341 
definite, 73, 211 
improper, 416-420 
indefinite, 120 
lower and upper, 74 
of a bounded function, 73 
of a complex-valued function, 369 
of a step function, 65 
of a vector-valued function, 513 
test, 397 
Integrand, 74 
Integration: 

by partial fractions, 258-264 
by parts, 217-220 
by substitution, 212-216 
of monotonic functions, 79 
of polynomials, 79, 81 
of rational functions, 258-264 
of trigonometric functions, 100, 207, 264 
Intercepts, 190, 495 
Intermediate-value theorem: 
for continuous functions, 144 
for derivatives, 187 (Exercise 10) 
Intersection of sets, 14 
Intervals, 60, 310 
Inverse : 

function, 146, 252 
matrix, 612 

transformation, 585, 586
trigonometric functions, 253-256 
Inversion, 146, 253 
Invertible transformation, 585-588 
Irrational numbers, 17, 22, 28, 31, 282 
Isoclines, 344, 348 
Isomorphism, 361, 600 
Isothermals, 198, 351 


Jump discontinuity, 131 
Jupiter, 545 


KEPLER, JOHANNES, 498, 545 
Kepler’s laws, 545, 546 
LAGRANGE, JOSEPH LOUIS, 171, 331, 445
Lagrange’s identity, 483 

Lagrange’s remainder in Taylor’s formula, 284 
Laplace’s equation, 305 
Lattice points, 60 (Exercise 4) 

Least squares, method of, 196 (Exercise 25) 
Least-upper-bound axiom, 25
Left-hand continuity, 126 
Left-hand coordinate system, 485 
Left-hand limit, 130 
Left inverse, 585, 611
LEGENDRE, ADRIEN-MARIE, 571 
Legendre polynomials, 571 
LEIBNIZ, GOTTFRIED WILHELM, 3, 10, 157, 172, 
210, 222, 305, 403 

Leibniz’s formula for the nth derivative of a 
product, 222 (Exercise 4) 

Leibniz’s notation : 
for derivatives, 172 
for primitives, 210 

Leibniz’s rule for alternating series, 404 
Length: 

of a curve, 530, 531 
of a vector, 453 

other definitions of, 461 (Exercises 17, 18) 
LEONARDO OF PISA (Fibonacci), 379
Level curve, 197 

L'HOPITAL, GUILLAUME FRANCOIS ANTOINE, 292 

L’Hopital’s rule, 292-298 
Limit(s): 

infinite, 298 

left- and right-hand, 129, 130 
of a function, 128 
of a sequence, 379 
of integration, 10, 74 
theorems on, 132 
Line(s) : 

Cartesian equation of, 475 
definition of, 472 
normal vector to, 476 
parallelism of, 473 
slope of, 169, 475 
tangent, 170, 518 
vector equation of, 475
Linear combination, 459, 556 
Linear dependence and independence, 463, 557 
Linear differential equation, 308, 322 
Linear function, 54 
Linear space (vector space), 551, 552 
Linear span, 462, 557 
Linear system of equations, 605 
Linear transformation, 578 
Linearity property: 

of convergent series, 385 
of derivatives, 164 
of integrals, 67, 80 


of Taylor operators, 276 

LOBATCHEVSKI, NIKOLAI IVANOVICH, 474 

Logarithms : 
base b, 232 

base e (Napierian or natural logarithms), 
229-232 

Logarithmic differentiation, 235 
Logarithmic function: 
calculation of, 240-242 
definition of, 227 
integration of, 235 
power series for, 390, 433 
Lorentz transformation, 614 (Exercise 6) 

Lower bound, 25 
Lower integral, 74 

MACHIN, JOHN, 285 

Major axis of an ellipse, 505 
Mass density, 118 
Mathematical induction, 32-37 
Mathematical model, 313
Matrix : 

algebraic operations on, 598, 601 
definition of, 592, 598 
diagonal, 595
representation, 592 
Maximum element, 23 
Maximum of a function : 
absolute, 150 
relative, 182 
Mean, 46, 149 
arithmetic, 46 
geometric, 47 (Exercise 20) 
harmonic, 46 
pth-power, 46, 149 
Mean distance from the sun, 546 
Mean-value theorem: 

Cauchy’s extension of, 186 
for derivatives, 185 
for integrals, 154, 219
Measurable set, 58, 111 
MERCATOR, NICHOLAS, 377, 390 
Mercury, 545 
Minimum element, 25 
Minimum of a function: 
absolute, 150 
relative, 182 

Minor axis of an ellipse, 505 
Modulus of a complex number, 363 
Moment, 118 
of inertia, 119 
Monotone property: 
of area, 59 

of averages, 119 (Exercise 13) 
of volume, 112
of work, 115 
Monotonic function, 76
Monotonic sequence, 381 

Motion : 

along a curve, 521 
of a rocket, 337 
simple harmonic, 334 
Multiplication: 

of functions, 55, 63, 132 

of matrices, 601 

of numbers, 17, 44, 358 

of transformations, 584 

of vectors (cross product), 483 

of vectors (inner product), 451, 469, 552 

of vectors by scalars, 447, 552 

NAPIER, JOHN, 232 

Napierian (natural) logarithms, 229-232 
Necessary and sufficient conditions, 394 
Neighborhood, 127 

NEWTON, ISAAC, 3, 157, 171, 305, 377, 498, 522
Newton’s law : 
of cooling, 315 
of motion, 314, 546
of universal gravitation, 546 
Non-Archimedean geometries, 26 
Non-Euclidean geometries, 474 
Nonsingular matrix, 611 
Norm : 

of a vector, 453 

of an element in a linear space, 563 
Normal : 

to a line, 476 
to a plane, 493 

to a plane curve, 529 (Exercise 14) 
to a space curve, 526 
Notations: 

for derivatives, 160, 171, 172, 200 
for integrals, 10, 65, 69, 210, 211, 513 
for products, 44 
for sets, 12 
for sums, 37 
for vectors, 446, 512 
n-space, 446 
nth derivative, 160 
nth root, 30, 145 
Null space, 580 
Nullity, 581 
Number : 
complex, 358 
irrational, 17, 22 
rational, 17, 22, 393 
real, 17 

Odd function, 84 (Exercise 25) 

Odd integer, 28 (Exercise 10) 


One-sided limits, 129-130
One-to-one correspondence, 360, 412 
One-to-one transformation, 587 
o-notation, 286 
Open interval, 60 
Operator : 
difference, 172 
differentiation, 172, 329, 579 
integration, 579 
linear, 578 
Taylor, 274 

Orbits of planets, 545-548 
Order axioms, 20 
Ordered pairs, 48, 53, 358 
Ordinate, 48 

Ordinate set, 58, 60, 61, 75 
Origin of coordinates, 48 
Orthogonal basis, 466, 568 
Orthogonal complement, 573 
Orthogonal matrix, 615 (Exercise 8) 
Orthogonal trajectory, 351 
Orthogonality: 

in a Euclidean space, 564 
of curves, 351 
of lines, 170 
of planes, 496 

of the sine and cosine, 106 (Exercise 31) 
of vectors, 455 
Orthonormal set, 466, 564 
Osculating plane, 526 

Parabola, 2, 54, 498, 500, 507 
Parabolic mirrors, 519 
Parabolic segment, area of, 3 
Paradox, Zeno’s, 374-377 
Parallelepiped, 112 
Parallelism: 
of lines, 473 
of planes, 479 
of vectors, 450 
Parallelogram law, 362, 449 
Parameter, 517

Parametric equations, 475, 517 
of a circle, 521 
of a helix, 523 

of a hyperbola, 524 (Exercise 12) 
of a line, 475 
of an ellipse, 522 
PARSEVAL, MARK-ANTOINE, 566 
Parseval's formula, 566 
Partial derivatives, 196-201 
Partial fractions, integration by, 258-264 
Partial sums, 375, 383 
Partition, 61 
PASCAL, BLAISE, 3 
Pascal’s triangle, 44 (Exercise 3)

PEANO, GIUSEPPE, 17 

Peano postulates for the integers, 17 
Periodic function, 95 
Periodic motion, 335, 339, 546 
Permutation, 412 
Perpendicularity : 
of lines, 170 
of planes, 496 
of vectors, 455 
π (pi):

computation of, 285 (Exercise 10) 
definition of, 91 

Piecewise monotonic functions, 77 
Planes, 478-482 
Polar coordinates, 108, 540 
Polar form of complex numbers, 367 
Polynomial approximations, 272-304, 575 
Polynomial functions, 55 
continuity of, 133 
differentiation of, 166 
integration of, 79, 81 
of two variables, 264 
Population growth, 320, 321 
Position function, 521, 540 
Power, functions, 54, 80 
series, 428-436 
circle of convergence, 428 
differentiation and integration of, 432 
interval of convergence, 431 
Prime numbers, 36 (Exercise 11), 50 
Primitive (antiderivative), 205, 210-219 
Product(s): 
cross, 483 

dot (inner), 451, 469, 561 
notation for, 44 
of functions, 55 
of numbers, 17, 44, 358 
scalar triple, 488 
Projections, 457, 458, 574 
Proper rational function, 259 
pth-power mean, 46, 149 
Pursuit problems, 352 
Pythagorean identity, 96, 455, 469, 573 
Pythagorean theorem, 49, 196 


Quadrant, 48 
Quadratic equation, 362 
Quadratic polynomial, 54 
Quotient, of functions, 55 
of numbers, 18, 360 


RAABE, JOSEF LUDWIG, 402 

Raabe's convergence test, 402 (Exercise 16) 


Radial acceleration, 542, 546 
Radial set, 109 
Radian measure, 102 
Radioactive decay, 313 
Radius: 

of convergence, 428 
of curvature, 537 
of gyration, 119 
Range of a function, 53, 578 
Rank, 581 

Rate of change, 160 
Rational function, 166 
of two variables, 264 
Rational number, 17, 22, 393 
Rational powers, 30, 135, 166, 206 
Ratio test, 400 

Real function (real-valued function), 51 

Real line (real axis), 22 

Real linear space, 552 

Real numbers, axioms for, 18-25 

Rearrangements of series, 411-413 

Reciprocal, 18, 360 

Rectifiable curves, 530 

Recursion formula, 220 (Exercise 8), 264, 379 
Recursive definition, 39 
Refinement of a partition, 62 
Related rates, 177 

Relative maximum and minimum, 182, 183 
Remainder in Taylor’s formula, 278-287 
Removable discontinuity, 131 
Riccati differential equation, 312 (Exercise 19) 

RIEMANN, GEORG FRIEDRICH BERNHARD, 3, 396, 413 

Riemann’s rearrangement theorem, 413 
Riemann zeta function, 396 
Right-hand continuity, 126, 131 
Right-hand coordinate system, 485 
Right-hand limit, 129 
Right inverse, 586 

ROBERVAL, GILES PERSONE DE, 3 
ROBINSON, ABRAHAM, 172 
Rocket with variable mass, 337 
ROLLE, MICHEL, 184 

Rolle’s theorem, 184 
Root mean square, 46 
Root test, 400 

Roots of complex numbers, 372 (Exercise 8) 
Row, matrix (row vector), 598 
operations, 608 


Scalar, 447, 468 

Scalar product (dot product), 451 
Scalar triple product, 488 
SCHMIDT, ERHARD, 568 

Sections of a cone, 497 
SEIDEL, PHILLIPP LUDVIG VON, 423 
Separable differential equation, 345 

Sequence, 378-381, 422-426 
Series: 

absolutely convergent, 406 
alternating, 403 
conditionally convergent, 406 
convergent and divergent, 384 
differentiation of, 427, 432 
exponential, 436 
geometric, 388 
harmonic, 384 
integration of, 432 
logarithmic, 433 
pointwise convergence of, 425 
power, 389, 428 
sine and cosine, 436 
Taylor’s, 434 
telescoping, 386 
uniformly convergent, 425 
Set, function, 57 
theory, 11-16 

Similarity transformation, 91, 349 
Simple harmonic motion, 334 
Simultaneous linear equations, 490, 605 
Sine function: 

complex-valued, 372 (Exercise 9) 
continuity of, 134, 139 
differential equation for, 323 
differentiation of, 162 
integration of, 100, 207 
power series for, 436 
properties of, 96 
Singular matrix, 613 
Skew-symmetry, 483 
Slope of a curve, 170 
Slope of a line, 169 

Small-Span theorem (uniform continuity), 152 
Solution of a differential equation, 306 
Space spanned by a set of vectors, 556 
Speed, 521 

Sphere, volume of, 114 
Square roots, 29 

computation of, 444 (Exercises 20, 21) 
Squeezing principle for limits, 133 
Step function, 62 
integral of, 65 
Step region, 58 
STOKES, GEORGE GABRIEL, 423 
Straight lines in n-space, 472 
Subsets, 12 

Subspace of a linear space, 556 
Substitution, integration by, 212-216 
Sum: 

of a convergent series, 384 
of functions, 55, 63, 132 


of numbers, 18, 358 
of vectors, 447, 551 
Summation notation, 37 
Surface, 197 

Systems of linear equations, 605 

Tangent function, 103 
Tangent line, 170, 518 
Tangent vector, 518 
TARTAGLIA, 3 
TAYLOR, BROOK, 274 
Taylor polynomial, 274-277 
Taylor’s formula with remainder, 278 
Taylor’s series of a function, 434 
Telescoping property: 
of finite sums, 40 
of infinite series, 386 
of products, 45 

TORRICELLI, EVANGELISTA, 3 

Tractrix, 353 

Transpose of a matrix, 615 (Exercise 7) 
Transverse axis of a hyperbola, 505 
Triangle inequality: 

in a Euclidean space, 563 
for complex numbers, 364 
for real numbers, 42 
for vectors, 454 
Trigonometric functions: 

complex-valued, 372 (Exercise 9) 
continuity of, 134, 139 
differentiation of, 162 
fundamental properties of, 95 
geometric description of, 102-104 
graphs of, 107 
integration of, 100 
power series for, 436 

Unbounded function, 73, 416 

Unbounded sequence, 381 

Undetermined coefficients, 332, 333, 441 

Uniform continuity theorem, 152 

Uniform convergence, 424 

Union of sets, 13 

Uniqueness theorems, 309, 324 

Unit coordinate vectors, 459 

Unit tangent vector, 525 

Unitary space, 562 

Upper bound, 23 

Upper integral, 74 

Variation of parameters, 331 

Vector(s): 

addition and subtraction, 447 
angle between, 458, 470 (Exercise 7) 
components of, 446 
cross product of, 483 
direction of, 450 

dot product (inner product) of, 451 
equality of, 447 
geometric, 448 
length (norm) of, 453 
multiplication by scalars, 447 
orthogonality of, 455 
parallelism of, 450 
Vector space (linear space), 552 
Velocity, 159, 521 
in polar coordinates, 541 
Venn diagram, 13 
Venus, 545 

Vertex, of ellipse or hyperbola, 505 
of a parabola, 507 
Vibrations, 335 
Volume: 

axiomatic definition of, 112 
solids of known cross section, 113 
solids of revolution, 113, 114 


WALLIS, JOHN, 3 

WEIERSTRASS, KARL, 17, 423, 427 
Weierstrass M-test for uniform convergence, 427 

Weighted average of a function, 118 
Weighted mean-value theorem, 154 
Well-ordering principle, 34 
Work, 115, 116 

WRONSKI, J. M. HOENE, 328 

Wronskian, 328 (Exercise 21), 330 


ZENO, 374 

Zeno’s paradox, 374-377 
Zero, 18 

complex number, 359 
element in a linear space, 552 
matrix, 599 
transformation, 579 
vector, 447 

Zero-derivative theorem, 205, 369 
Zeta function of Riemann, 396 






